Preface


Preface to the Second English edition (2007).© 
This is Version 1.3: June 15, 2010 


In 2007 I recovered the Copyright. This is a new version that follows closely
the first edition by Springer-Verlag. I made very few changes. Among them,
Gauss' method, already inserted in the second Italian edition, has been
included here. Believing that my knowledge of the English language has
improved since the late 1970's, I have changed some words and constructions.

This version has been reproduced electronically (from the first edition) and 
quite a few errors might have crept in; they are compensated by the corrections 
that I have been able to introduce. This version will be updated regularly and 
typos or errors found will be amended: it is therefore wise to wait some time
before printing the file; the versions will be updated and numbered. The ones 
labeled 2.* or higher will have been entirely proofread at least once. 

As owner of the Copyright I leave this book on my website for free
downloading and distribution. Optionally, colleagues who download the book
could send me a one-line message (saying “downloaded”, at least): I will be
grateful. Please signal any errors, or sources of unhappiness, you spot.

On the web site I also put the codes that generate the nontrivial figures
and which provide rough attempts at reproducing results whose originals are
in the quoted literature. Discovering the phenomena was a remarkable
achievement: but reproducing them, having learnt what to do from the original
works, is not really difficult if a reasonably good computer is available.

Typeset with the public Springer-Latex macros. 


Giovanni Gallavotti
Roma, 18 August 2007


Copyright owned by the Author 
Preface to the first English edition


The word “elements” in the title of this book does not convey the
implication that its contents are “elementary” in the sense of “easy”: it
mainly means that no prerequisites are required, with the exception of some
basic background in classical physics and calculus.

It also signifies “devoted to the foundations”. In fact, the arguments chosen
are all very classical, and the formal or technical developments of this century
are absent, as well as a detailed treatment of such problems as the theory
of the planetary motions and other very concrete mechanical problems. This
second meaning, however, is the result of the necessity of finishing this work
in a reasonable amount of time rather than an a priori choice.

Therefore a detailed review of the “few” results of ergodic theory, of the
“many” results of statistical mechanics, of the classical theory of fields
(elasticity and waves), and of quantum mechanics is also totally absent; these
topics could constitute the subject of two additional volumes on mechanics.

This book grew out of several courses on “Meccanica Razionale”, i.e., 
essentially, Theoretical Mechanics, which I gave at the University of Rome 
during the years 1975-1978. 

The subjects cover a wide range. Chapter 2, for example, could be used in 
an undergraduate course by students who have had basic training in classical 
physics; Chapters 3 and 4 could be used in an advanced course; while Chapter 
5 might interest students who wish to delve more deeply into the subject, and 
it could be used in a graduate course.

My desire to write a self-contained book that gradually proceeds from 
the very simple problems on the qualitative theory of ordinary differential 
equations to the more modern theory of stability led me to include arguments
of mathematical analysis, in order to avoid having to refer too much to existing 
textbooks (e.g., see the basic theory of the ordinary differential equations in 
§2.2-§2.6 or the Fourier analysis in §2.13, etc.). 

I have inserted many exercises, problems, and complements which are 
meant to illustrate and expand the theory proposed in the text, both to avoid 
excessive size of the book and to help the student to learn how to solve theoret- 
ical problems by himself. In Chapters 2-4, I have marked with an asterisk the 
problems which should be developed with the help of a teacher; the difficulty 
of the exercises and problems grows steadily throughout the book, together 
with the conciseness of the discussion. 

The exercises include some very concrete ones which sometimes require 
the help of a programmable computer and the knowledge of some physical 
data. An algorithm for the solution of differential equations and some data 
tables are in Appendix O and Appendix P, respectively. 

The exercises, problems, and complements must be considered as an important
part of the book, necessary to a complete understanding of the theory.
In some sense they are even more important than the propositions selected
for the proofs, since they illustrate several aspects and several examples and
counterexamples that emerge from the proofs or that are naturally associated
with them.

I have separated the proofs from the text: this has been done to facilitate 
reading comprehension by those who wish to skip all the proofs without los- 
ing continuity. This is particularly true for the more mathematically oriented 
sections. Too often students tend to confuse the understanding of a mathemat- 
ical proposition with the logical contortions needed to put it into an objective, 
written form. So, before studying the proof of a statement, the student should 
meditate on its meaning with the help (if necessary) of the observations that 
follow it, possibly trying to read also the text of the exercises and problems 
at the end of each section (particularly in studying Chapters 3-5). 

The student should bear in mind that he will have understood a theorem 
only when it appears to be self-evident and as needing no proof at all (which 
means that its proof should be present in its entirety in his mind, obvious and 
natural in all its aspects and, if necessary, describable in all details). This level 
of understanding can be reached only slowly through an analysis of several 
exercises, problems, examples, and careful thought.

I have illustrated various problems of classical mechanics, guided by the
desire to propose always the analysis of simple rather than general cases. I
have carefully avoided formulating “optimal” results and, in particular, have
always stressed (by using them almost exclusively) my sympathy for the only
“functions” that bear this name with dignity, i.e., the $C^\infty$-functions and
the elementary theory of integration (“Riemann integration”).

I have tried to deal only with concrete problems which could be
“constructively” solved (i.e., involving estimates of quantities which could
actually be computed, at least in principle) and I hope to have avoided
indulging in purely speculative or mathematical considerations. I realize that
I have not been entirely successful and I apologize to those readers who agree
with this point of view without, at the same time, accepting mathematically
nonrigorous treatments.

Finally, let me comment on the conspicuous absence of the basic elements 
of the classical theory of fluids. The only excuse that I can offer, other than 
that of non pertinence (which might seem a pretext to many), is that, perhaps, 
the contents of this book (and of Chapter 5 in particular) may serve as an 
introduction to this fascinating topic of mathematical physics. 

The final sections, §5.9-§5.12, may be of some interest also to nonstudents
since they provide a self-contained exposition of Arnold’s version of the
Kolmogorov-Arnold-Moser theorem.

This book is an almost faithful translation of the Italian edition, with the
addition of many problems and §5.12 and with §5.5, §5.7, and §5.12 rewritten.

I wish to thank my colleagues who helped me in the revision of the 
manuscript and I am indebted to Professor V. Franceschini for providing (from 
his files) the very nice graphs of §5.8. 
I am grateful to Professor Luigi Radicati for the interest he showed in
inviting me to write this book and providing the financial help from the Italian 
printer P. Boringhieri. 

The English translation of this work was partially supported by the 
“Stiftung Volkswagenwerk” through the IHES.


Giovanni Gallavotti 
Roma, 27 December 1981 
Phenomena Reality and models


1.1 Statements 


The results of physical experiments are determined by observations based 
on the measurement of various entities, i.e. the association of well defined 
sequences of numbers with well defined sequences of events. 

The physical entities are “operationally defined”. This means that they 
are defined in terms of the operations used to construct the numbers that 
provide their “measure”. 

For instance, the sequence of operations necessary to measure the “distance”
between two given points P and Q in space consists in choosing a
particular ruler and placing it on the straight line joining points P and Q,
starting from P. Taking the endpoint of the ruler as the new starting point,
the procedure is repeated n times until the endpoint of the ruler is
superimposed on Q. If the distance PQ is not an exact multiple of the length of
the ruler, one may, after n such operations, reach a point $Q_n \neq Q$ preceding
Q on the line PQ; and after n + 1 operations one may reach a point $Q_{n+1}$
following Q on the line PQ. Then one takes a new ruler “ten times shorter” and
puts it on $Q_n Q$ trying to match, as before, the second endpoint with Q. When
this turns out to be impossible, one can, as in the first case, define a new
point $Q_{n_1}$ on $Q_n Q$ and, then, take a third ruler ten times shorter than
the second and repeat the operation.

Thus, inductively, a number $n + 0.n_1 n_2 \ldots$ (in decimal representation) is
built which, by definition, is the measure of the distance between P and Q.
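
The inductive construction of the digits can be mimicked numerically. The following is only an illustrative sketch: the function name `ruler_measure`, the use of exact fractions to stand for an idealized distance, and the sample distance 22/7 are assumptions made here, not part of the text.

```python
from fractions import Fraction

def ruler_measure(distance, ruler=Fraction(1), n_digits=6):
    """Mimic the ruler procedure: count the whole rulers that fit before Q,
    then switch to a ruler ten times shorter for each further digit."""
    remaining = Fraction(distance)
    n = remaining // ruler              # whole rulers laid down before reaching Q
    remaining -= n * ruler              # what is left of the segment Q_n Q
    digits = []
    for _ in range(n_digits):
        ruler /= 10                     # a ruler "ten times shorter"
        d = remaining // ruler          # how many short rulers fit in what is left
        digits.append(int(d))
        remaining -= d * ruler
    return int(n), digits

# Hypothetical distance PQ equal to 22/7 ruler lengths.
n, digits = ruler_measure(Fraction(22, 7))
print(n, digits)
```

The pair `(n, digits)` plays the role of the number n + 0.n1n2... built in the text; the idealization hidden in `Fraction` is exactly the one the next paragraph questions.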

The above sequence of operations appears well defined but, in fact, a careful
analysis shows that it does not have the prerequisites to be considered a
mathematically precise definition. What, for instance, is “space”, what is a
“point”, what is a “ruler”? Is it possible to “divide” a ruler into parts, and
infinitely often?

The physicist is not too concerned (or, rather, not at all concerned) with 
such aspects of the question: he considers a physical entity well defined when- 
ever the empirical procedure necessary for its measurement is clear. 

A measurement procedure is considered to be clear when every observer 
is led to the same result when measuring the same physical entity. It should 
be stressed, however, that this is an empirical criterion perpetually subject to 
critique; thus physical entities which today are considered to be well defined 
may no longer be so in the future. 

Hence, the physicist, from his observations of nature, obtains a set of
numbers corresponding to the performance of some operations which are
considered to be “objectively defined”.

In the attempt to organize such numbers coherently, the physicist formulates
“models”: i.e., he associates well-defined mathematical structures with
his measurements, and he tries to establish a (small) number of mathematical
relationships among them. From such relationships new ones logically follow
which, reinterpreted through the model used inversely, may serve to predict
new relations between various empirical measurements.

The belief in the existence of good models motivated Galileo to write: 
“Philosophy is written in the great book which is always open before our eyes 
(I mean the universe) but it cannot be understood unless one first learns the 
language and distinguishes the characters in which it is written. It is a mathe- 
matical language and the characters are triangles, circles and other geometri- 
cal figures, without which it cannot be understood by the human mind; without 
them one would vainly wander through a dark labyrinth”. 1

A mathematical model is considered satisfactory whenever it does not lead 
to contradictions with the experiments. If a contradiction occurs, the physicist 
dismisses the model as “wrong”; nevertheless, the mathematical construction 
built with it remains valid and is witness to an imperfect representation of 
nature. 

Strictly speaking there is no model which is not wrong: only models that 
have not yet been shown to be wrong exist. However, all “serious” models (such 
as the dynamics of point masses, the theory of relativity, quantum mechanics, 
electromagnetism, thermodynamics, statistical mechanics, etc.) have led, and 
still lead, to the formulation of extremely interesting mathematical problems. 
Furthermore, it often happens that the analysis of the mathematical properties 
of a “wrong” model helps in the formulation of the new “more elaborate” 
model that the physicist tries to set up as a substitute. 

A link between phenomena reality and mathematics can therefore be
established as just described, through what has been called “a model”. However,
it would be impossible to give a precise mathematical definition of the notion
of a model because it is a rather empirical notion which can only be well
understood through the analysis of several concrete cases.

1 G. Galilei, Il Saggiatore, p. 232, [20].


1.2 An example of a Model 


Consider the historically particularly important and significant case of the 
“mechanics of point masses”. Its construction from empirical observations 
will be briefly and concretely analyzed, presenting it as a model of one or 
several point masses subject to forces. 

The first statement (or “axiom”, to use a mathematical term) says that 
the point masses are in a three-dimensional Euclidean space R3 in which any 
point can be represented by its three coordinates with respect to an orthogonal 
reference system (O; i,j,k). The notation means that O is the origin and i,j,k 
are the three orthogonal unit vectors pointing along the x, y, z coordinate axes, 
respectively. 

Such an idealization has a clear mathematical meaning, but it appears to 
be unprovable in mathematical terms: it just renders the following empirical 
observation. 

In practice, a point in space is determined by measuring (often only in 
principle and with the ruler method described in §1.1) its distance from three 
orthogonal walls. It is to be remarked that all such operations are ordinarily 
considered well defined. 

A second statement (or “axiom”) concerns “time” which, for the physicist,
is the physical entity measured by a “clock” (classically described as a
pendulum, although any more modern device will do as well). One assumes that
time is an absolute “entity”: in other words, one states that, at least in
principle, it is possible to associate with every point in space a clock
mechanically identical at every point, and, furthermore, to coordinate
(“synchronize”) the clocks.

This means that if P, P’ are two points and t, t’ are two chosen time
instants with t < t’, it is then possible to send a signal from P towards P’
leaving P at time t and reaching P’ at time t’ (as indicated by the local
clocks in P and in P’, respectively); while, vice versa, if t > t’, the above
operation should be impossible.

A little thought makes it clear that the operational definition of a “system 
of synchronized clocks” is based on the empirical fact that it is possible to 
send signals with arbitrary speed. It is also clear that the notion of time is a 
phenomenological notion, far from being mathematically well posed. 

Accepting the point of view so far discussed, one is led to say that the
mathematical scheme, or model, representing the space-time continuum, where our
observations take place, consists of a four-dimensional space: each of its points
(x, y, z, t) represents a point seen in a Cartesian coordinate frame (O; i, j, k)
(“laboratory”) and observed at the instant t (as measured by the formerly
introduced universal clocks).

Empirically, a point mass is any object which, at least as far as our
observations are concerned, can be assimilated with a point in space (for
instance, a planet or a star in the universe, a stone falling in a ravine, a
ship sailing in the ocean, etc.). Such a point preserves its identity over the
course of time; hence, it is possible to define its trajectory through a
function of time $t \to \mathbf{x}(t)$, where $\mathbf{x}(t) = (x(t), y(t), z(t))$
is the vector whose components are the coordinates of the point at time t, in
the chosen reference frame (O; i, j, k).

Mathematically, a point mass moving in the reference frame (O; i, j, k)
observed as t varies over an interval I is represented as a curve C in $R^3$ by
the vector equations $P(t) - O = \mathbf{x}(t)$, $t \in I$; and the parameter t
has the interpretation of time (i.e., it is called “time”).

Given a point mass moving as t varies in I, one can associate with it its
“velocity” at time $t \in I$. Operationally, velocity is defined by fixing
$t_0 \in I$, finding the positions $P(t_0)$ and $P(t_0 + \varepsilon)$, and setting

$$\mathbf{v} = \frac{P(t_0 + \varepsilon) - P(t_0)}{\varepsilon} \tag{1.2.1}$$

where the parameter $\varepsilon > 0$ is to be chosen “suitably small” (according
to well-defined criteria which, however, depend on the concrete cases). The
mathematical model defines the point mass velocity at time $t_0 \in I$ as the
derivative of the function $t \to \mathbf{x}(t)$ at $t = t_0$.
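
The relation between the operational velocity of Eq. (1.2.1) and the derivative used in the model can be illustrated numerically. In this sketch the trajectory `position` (a helix) and the names `operational_velocity`, `t0` and `eps` are hypothetical choices made only for the example.

```python
import math

def position(t):
    """Hypothetical trajectory x(t) = (cos t, sin t, t), for illustration only."""
    return (math.cos(t), math.sin(t), t)

def operational_velocity(x, t0, eps):
    """Difference quotient (x(t0 + eps) - x(t0)) / eps, as in Eq. (1.2.1)."""
    p0, p1 = x(t0), x(t0 + eps)
    return tuple((b - a) / eps for a, b in zip(p0, p1))

t0 = 1.0
exact = (-math.sin(t0), math.cos(t0), 1.0)   # derivative of the model trajectory
for eps in (1e-1, 1e-3, 1e-5):
    v = operational_velocity(position, t0, eps)
    err = max(abs(a - b) for a, b in zip(v, exact))
    print(f"eps={eps:g}  max deviation from the derivative = {err:.2e}")
```

For a smooth trajectory the deviation shrinks roughly in proportion to `eps`, which is one way of reading the phrase “suitably small”.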

To complete the mathematical model of a point mass, it is important to 
define the “force” acting on it. 

Operationally, the force acting at a given instant on the point mass
consists of three scalar quantities which together define a vector
$\mathbf{f}(t)$. The force acting on the point mass moving in $R^3$ and observed
in the frame (O; i, j, k) is measured through a “dynamometer”, an instrument
whose use is convenient to describe in a strongly idealized form. It is,
basically, a suitably built spring which will be imagined as a very thin, light
segment with a hook.

Consider a point mass moving in $R^3$, with a velocity
$\mathbf{v} = (v_x, v_y, v_z)$ relative to the reference frame (O; i, j, k) at
time $t_0$. To measure the force acting upon it, hook it to the dynamometer to
which the same velocity $\mathbf{v}$ has been imparted and which will be kept
fixed during the measurement. Then try to adjust the spring length and
direction so that the acceleration at time $t_0 + \varepsilon$ is 0, where
$\varepsilon > 0$ is chosen “suitably small”. (The empirical notion of
acceleration and the corresponding mathematical model of it, as the second
derivative with respect to t of the point position, is discussed along the same
lines as the notion of velocity.)

The force is then the vector $\mathbf{f}$ whose direction is that of the
dynamometer at time $t_0 + \varepsilon$, whose orientation is that parallel to
the hook but pointing away from it and whose modulus is the size of the spring
elongation.

Summarizing: a point mass subject to forces and observed in a frame
(O; i, j, k) in $R^3$ as time varies within an interval I is, in its
mathematical model, described by a curve in seven-dimensional space: one of its
points $(t, x, y, z, f_x, f_y, f_z)$ represents a point mass which at time t has
coordinates (x, y, z) in (O; i, j, k) and, in the same frame, is subject to a
force $(f_x, f_y, f_z)$. The curve representing this situation can be
parameterized by the parameter t itself, as t varies in some time interval I;
it shall also be assumed that in this parametric representation the functions
$t \to (x(t), y(t), z(t))$ are twice continuously differentiable so that a
mathematical definition of velocity and acceleration is meaningful.


1.3 The Laws of Mechanics 


Once it is established what is meant by a point mass subject to forces and 
studied in a given frame of reference in R3 as the time varies in an interval 
I (briefly, “a point mass subject to forces”), it is possible to complete the 
mathematical model of the point mechanics. For this purpose, the “laws of 
dynamics” and their mathematical interpretation have to be discussed. 

Experimentally, given a point mass, a simple relation is observed between 
its acceleration a at time t (in a given frame of reference) and the force f acting 
on it at that time (observed in the same frame). Such a relation is called the 
Second Law of Mechanics and establishes the existence of a constant m > 0, 
characteristic of the point mass and independent of the frame of reference 
used for the observations, such that: 


$$m\mathbf{a} = \mathbf{f}. \tag{1.3.1}$$


This law introduces, via the properties of the differential equations, many
relations among the quantities $\mathbf{x}, \mathbf{v}, t$, and such relations
can sometimes be experimentally checked. For instance, if it is known a priori
which force will act on the point mass whenever it is at the point (x, y, z) at
time t with velocity $(v_x, v_y, v_z)$, then, denoting such force as
$\mathbf{f}(v_x, v_y, v_z, x, y, z, t) = \mathbf{f}(\mathbf{v}, \mathbf{x}, t)$,
the differential equation


$$m\ddot{\mathbf{x}} = \mathbf{f}(\dot{\mathbf{x}}, \mathbf{x}, t) \tag{1.3.2}$$


allows the determination of the motion following an initial state, in which the
velocity $\mathbf{v}_0$ and the position $\mathbf{x}_0$ are given at time $t_0$,
at least for a small time interval around $t_0$ if $\mathbf{f}$ is a smooth
function, see Chapter 2.
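
How an initial state determines the motion through Eq. (1.3.2) can be sketched numerically. The code below uses a standard fourth-order Runge-Kutta step for a single scalar coordinate; the scheme, the function names `rk4_step` and `solve`, and the two sample force laws are choices made here for illustration and are not taken from the book.

```python
import math

def rk4_step(f, m, x, v, t, h):
    """One Runge-Kutta step for m * x'' = f(v, x, t), written as the
    first-order system x' = v, v' = f(v, x, t) / m (scalar x for brevity)."""
    def acc(v_, x_, t_):
        return f(v_, x_, t_) / m
    k1x, k1v = v, acc(v, x, t)
    k2x, k2v = v + 0.5*h*k1v, acc(v + 0.5*h*k1v, x + 0.5*h*k1x, t + 0.5*h)
    k3x, k3v = v + 0.5*h*k2v, acc(v + 0.5*h*k2v, x + 0.5*h*k2x, t + 0.5*h)
    k4x, k4v = v + h*k3v, acc(v + h*k3v, x + h*k3x, t + h)
    x_new = x + h*(k1x + 2*k2x + 2*k3x + k4x)/6
    v_new = v + h*(k1v + 2*k2v + 2*k3v + k4v)/6
    return x_new, v_new

def solve(f, m, x0, v0, t0, t1, steps=1000):
    """Follow the motion from the initial state (x0, v0) at time t0 up to t1."""
    h = (t1 - t0)/steps
    x, v, t = x0, v0, t0
    for _ in range(steps):
        x, v = rk4_step(f, m, x, v, t, h)
        t += h
    return x, v

# Hypothetical force laws: a constant weight and a linear restoring force.
print(solve(lambda v, x, t: -2.0*9.8, 2.0, 10.0, 3.0, 0.0, 1.0))
print(solve(lambda v, x, t: -x, 1.0, 1.0, 0.0, 0.0, math.pi))
```

The argument order `f(v, x, t)` follows the notation of the text; everything else, including the step count, is an assumption of the sketch.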

The First Principle of Mechanics postulates the existence of at least one 
reference frame (O;i,j,k), called “inertial frame”, in $R^3$ where a point mass
“far” from the other objects in the universe appears to be subjected to a null 
force in (O; i,j,k). Such a frame is experimentally identified with a frame with 
origin in a fixed star and with axes oriented towards three more fixed stars. 
It is to such a frame that motion is often referred. 

Of course the notions of “far” and of “fixed star” are empirical notions 
rather than mathematical ones. 
Mathematically, the first principle is used to grant to a particular frame
of reference in the space-time continuum a privileged role and to define the
“absolute force” or the “true force” as that acting on the point mass in this
frame. This frame has to be chosen once and for all and is called the “fixed
reference frame” (as opposed to “moving reference frame”).

It is possible and sometimes convenient to introduce frames whose origin
and axes vary with time with respect to the “fixed” frame (O; i, j, k):
$(O(t); \mathbf{i}(t), \mathbf{j}(t), \mathbf{k}(t))$.

Since f = ma, it follows that if the moving frame is in uniform rectilinear 
translational motion with respect to the fixed frame, then the force acting 
upon the point is the same whether observed in the fixed frame or in the 
moving frame: hence, in this moving frame, the “inertia principle”, i.e., the 
first principle, is valid: a point mass which is “very far” from the other objects 
in the universe is subject to a null force, since the acceleration is the same 
in the two frames. All frames in rectilinear uniform motion with respect to a 
fixed frame are called “inertial frames”. 
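
The equality of the accelerations in the two frames can be made explicit with a short computation. In the sketch below the symbols $\mathbf{u}$ (the constant velocity of the moving origin), $\mathbf{c}$ (its position at $t = 0$) and $\mathbf{x}'$ (the coordinates in the moving frame) are notations introduced here for illustration, and the moving axes are assumed to stay parallel to the fixed ones.

```latex
\begin{aligned}
\mathbf{x}'(t) &= \mathbf{x}(t) - \mathbf{c} - \mathbf{u}\,t, \\
\dot{\mathbf{x}}'(t) &= \dot{\mathbf{x}}(t) - \mathbf{u}, \\
\ddot{\mathbf{x}}'(t) &= \ddot{\mathbf{x}}(t)
\quad\Longrightarrow\quad
m\,\mathbf{a}' = m\,\mathbf{a} = \mathbf{f}.
\end{aligned}
```

The velocities differ by the constant $\mathbf{u}$, but the second derivative removes it, so the force measured through the second law is unchanged.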

The mathematical model of a point mass with mass m subject to forces 
and obeying the laws of dynamics is then, simply, a point mass subject to 
forces, in the sense of the preceding section, and such that the relation 


ma=f (1.3.3) 


holds and, furthermore, f is a function of the point velocity, position, and 
time; i.e., the following relation holds: 


f = f(v,x,t). (1.3.4) 


Clearly, from such a mathematical viewpoint (where f is imagined as given a 
priori), the first principle is deprived of its deep physical meaning. 

An important extension of the point mass model is a model for the
mechanics of a “system of N point masses”. Mathematically, such a system
consists of N point masses with masses $m_1, \ldots, m_N$, in the above sense,
satisfying the Third Principle of Mechanics. This means that it should be
possible to represent the force $\mathbf{f}_i$ acting on the i-th point as


$$\mathbf{f}_i = \sum_{j \neq i} \mathbf{f}_{j \to i}, \tag{1.3.5}$$

where $\mathbf{f}_{j \to i}$ are such that
(a) $\mathbf{f}_{j \to i} = -\mathbf{f}_{i \to j}$, $i, j = 1, 2, \ldots, N$, $i \neq j$;
(b) $\mathbf{f}_{j \to i}$ is parallel to $P_i - P_j$, i.e., to the line joining the
positions $P_i$ and $P_j$ of the i-th and j-th points;
(c) $\mathbf{f}_{j \to i}$ depends solely upon the positions and velocities of the
i-th and j-th points and on time:


$$\mathbf{f}_{j \to i} = \mathbf{f}_{j \to i}(\mathbf{v}_j, \mathbf{v}_i, P_j, P_i, t). \tag{1.3.6}$$
This assumption corresponds to a precise empirical fact: it is possible to define
operationally what should be understood by $\mathbf{f}_{j \to i}$, “the force
exerted by the point $P_j$ on the point $P_i$”.

For instance, the force $\mathbf{f}_{j \to i}$ could be measured as follows: one
measures, in the given inertial frame of reference, the force $\mathbf{f}_i$
acting on i and then one measures, after removing the point j from the system,
the new force acting on the i-th point, obtaining the result
$\mathbf{f}_i^{(j)}$; then one sets


$$\mathbf{f}_{j \to i} = \mathbf{f}_i - \mathbf{f}_i^{(j)}. \tag{1.3.7}$$


The Third Principle of Mechanics arises from the experimental observation
that $\mathbf{f}_{j \to i} = -\mathbf{f}_{i \to j}$, that $\mathbf{f}_{j \to i}$ is
parallel to $P_j - P_i$, that the total force acting on a single point mass is
the sum of the forces exerted on it by the other system points (in the sense of
vector addition) if observed in an inertial frame of reference, and, finally,
that $\mathbf{f}_{j \to i}$ depends only upon the positions and velocities of the
points involved and, possibly, on time.

Physics often places still more requirements and restrictions upon the laws 
of force which can be used to give a more detailed specification of a mechani- 
cal system model. However, they do not have a general character comparable 
to the three principles but, rather, are statements explaining which laws of 
force are to be considered a good model under given circumstances. For
instance, two point masses “without structure” (this is, again, an empirical
notion which we refrain from elucidating) attract each other with a force of
intensity $mm'/kr^2$, where r is the distance between the points, m and m’ are
their masses, and k is a universal constant. If the structure of the two points
can be summarized by saying that they have an “electric charge e” (a new
empirical notion), the mutual force will be the vector sum of the
above-described gravitational force and of a repulsive force with intensity
$k'e^2/r^2$, where k’ is another universal constant.
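
As a concrete check, the attractive force law just quoted can be coded and tested against the requirements of the Third Principle. This is a sketch under stated assumptions: the intensity is read as m m'/(k r^2), the value k = 1 is a placeholder rather than a physical constant, and the names `pair_force` and `total_forces` are hypothetical.

```python
import math

def pair_force(mi, mj, Pi, Pj, k=1.0):
    """Force exerted by point j on point i: attractive, directed along
    P_j - P_i, with intensity mi*mj/(k*r**2) (k is a placeholder value)."""
    d = [b - a for a, b in zip(Pi, Pj)]          # vector P_j - P_i
    r = math.sqrt(sum(c*c for c in d))
    mag = mi*mj/(k*r*r)
    return [mag*c/r for c in d]

def total_forces(masses, points, k=1.0):
    """Total force on each point as the vector sum over the other points."""
    n = len(masses)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j != i:
                fji = pair_force(masses[i], masses[j], points[i], points[j], k)
                forces[i] = [a + b for a, b in zip(forces[i], fji)]
    return forces

masses = [1.0, 2.0, 3.0]
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
F = total_forces(masses, points)
# The pair forces cancel two by two, so the components of the grand total
# should vanish up to rounding.
print([sum(f[c] for f in F) for c in range(3)])
```

Each pair force is antisymmetric under the exchange of the two points and lies along the line joining them, so the sum of all the total forces vanishes up to rounding.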

The principles of mechanics already place enough restrictions upon the na- 
ture of the forces admissible in mechanical problems: therefore it is convenient 
and interesting to examine their implications before passing to the analysis 
of special models obtained by concretely specifying the “force laws”, i.e., the 
functions giving the forces in terms of the points positions and velocities and 
of time. 

It should be stressed, and this is a general comment on the mathemati- 
cal models for physical phenomena, that the mathematical model is always 
“poorer” than the physical reality that it tries to imitate. For instance in the 
above mathematical model for mechanics, the first principle loses its meaning. 
Another example, implicit in the above discussion, is the following. 

To give an operational meaning to the notions of position, speed, force, etc.,
it must be possible to repeat “identical” experiments several times (e.g., see
the position measurement in §1.1) by repeating the measurement operations.
However, time inexorably flows away, and this is impossible. Physically, this
difficulty is avoided by the “principle of homogeneity of space-time” which
says that experiments starting at any time in any space location will yield
the same results if the points involved are in the same relative positions and
situations.

In the mathematical model for mechanics just described, the necessity of 
understanding the above problems does not arise, nor do many other similar 
problems which the reader will easily think of. 

Usually it is possible to complicate the models in order to imbue them with 
any given number of physical facts: but an analysis of this type of questions 
would lead us beyond the scope of this book. 

In any case, a decision is always needed on where to put a stop to the 
process of model improvement, which would otherwise hopelessly continue ad 
infinitum. We must recall that we have the more down-to-earth, and more 
interesting, problem of obtaining some concrete prediction algorithms for our 
observations of nature. 


1.4 General Thoughts on Models 


In this book more abstract schematization processes concerning empirically 
observed phenomena will be met (e.g., when we discuss the notion of an 
“observable” or of a “vibrating string”). In such cases, however, the details 
of the construction of the mathematical model will not be repeated: a very 
common practice based on the idea that the very words used to designate 
well-defined mathematical objects will implicitly define the model. 

It is such a practice, or better, its imperfect understanding, which some- 
times causes misunderstandings between physicists and mathematicians and 
provokes allegations of non-rigorous use of mathematics. 

It is important to realize that when the physicist speaks in mathematical 
terms he is by no means attributing to them the same rigid meaning that a 
mathematician would assume for them. Rather he is using this language to 
help himself in the formulation of a model which, once well defined, he shall 
rigorously treat (since he believes, or at least hopes, that the book of nature 
is written in mathematical characters). 

Possibly logically non rigorous steps or apparently wild mathematical ap- 
proximations in a physicist’s argument should always be interpreted as further 
complications or, better, refinements of the model that the physicist is trying 
to build. 

In the hectic development of research, a physicist often modifies a model 
while using it, or he modifies the mathematical meaning of the objects and 
entities which belong to the model without changing their names (otherwise, 
a dictionary would not suffice). He does this because his main interest is in 
the construction of models and only secondarily in their mathematical theory,
often considered trivial for his needs. 

To avoid excessively pedantic discussions, we shall adhere, in the following,
to the well-established practice of avoiding the physical analysis necessary to
the construction of a model and shall leave it to the reader to imagine such an
analysis via the suggestive names used for the various mathematical entities
(with the exception of a few important cases). In any case, this book is devoted
to the mathematical, rather than physical, aspects of mechanical problems.


Bibliographical Comment. It is very useful to study at least the defi- 
nition and the laws of motion in the Philosophiae Naturalis Principia Mathe- 
matica by I. Newton, [37], to understand exactly the Newtonian formulation 
of mechanics and its modernity. To avoid “reading too much”, i.e., to avoid 
interpreting these immortal pages in too modern a way, it is a good idea to 
read the paper Essays on the history of mechanics by C. Truesdell, pp. 85-137 
([48]). The reading of the first two chapters of the work by E. Mach, [31],
will be a very useful and stimulating complement to the first three chapters 
of this book. 


2 


Qualitative Aspects of One-Dimensional 
Motion 


2.1 Energy Conservation 


Consider a point mass, with mass m, on the line R and subject to a force law
depending only on its position. Therefore, a force law $\xi \to f(\xi)$,
$\xi \in R$, is given, which we shall suppose to be of class $C^\infty$,
associating with every point $\xi$ on the line R the component $f(\xi)$ of the
force acting on the point when it happens to occupy the position $\xi$.

A “motion” of the point mass, observed as t varies in an interval I, is a
function $t \to x(t)$, $t \in I$, of class $C^\infty(I)$ such that


$$m\ddot{x}(t) = f(x(t)), \quad \forall t \in I. \tag{2.1.1}$$

The “energy conservation theorem” follows by multiplying Eq. (2.1.1), side
by side, by $\dot{x}(t)$:

$$m\dot{x}\ddot{x} = f(x)\dot{x}, \tag{2.1.2}$$


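omitting, as will often be done, the explicit mention of the $t$-dependence. 
Then, defining the functions 

$$\eta \to T(\eta) = \frac{1}{2} m \eta^2, \qquad \xi \to V(\xi) = -\int_0^{\xi} f(\xi')\, d\xi', \tag{2.1.3}$$

it is 

$$\frac{d}{dt} T(\dot{x}) = m\dot{x}\ddot{x}, \qquad \frac{d}{dt} V(x) = -f(x)\dot{x}, \tag{2.1.4}$$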
so that Eq. (2.1.2) becomes 

$$\frac{d}{dt}\bigl(T(\dot{x}) + V(x)\bigr) = 0. \tag{2.1.5}$$

This implies that a constant $E$ can be associated with every motion 
$t \to x(t)$, $t \in I$, depending on the motion under consideration and such that 

$$T(\dot{x}(t)) + V(x(t)) = E, \qquad \forall\, t \in I. \tag{2.1.6}$$


The expressions $T(\dot{x})$ and $V(x)$ are respectively called the “kinetic energy” 
and the “potential energy” and Eq. (2.1.6) has to be read as follows: “in every 
motion developing under the action of a force with potential energy V, the 
sum of the kinetic energy and potential energy is a constant”. This constant 
is given the name “total energy” of the considered motion. The “qualitative 
theory” of Eq. (2.1.1) is concerned with the analysis of the properties of the 
motion verifying Eq. (2.1.1), which are valid independently of the choice of 
f, at least for vast classes of functions f. The energy conservation is a first 
example of a qualitative property. 


Observations. The energy conservation goes back at least to Huygens; after- 
wards, it was used by J. and D. Bernoulli together with the law of conservation 
of linear momentum (Descartes) (see [48], p. 105 and following). 


Eq. (2.1.6) implies an expression for the velocity: 

$$\dot{x}(t) = \pm\Bigl(\frac{2}{m}\bigl(E - V(x(t))\bigr)\Bigr)^{1/2}, \qquad t \in I. \tag{2.1.7}$$

This relation, which will be used and discussed in §2.6, allows the reduction 
of the determination of the evolution law $t \to x(t)$, $t \in I$, “time law”, 
to an area-computation problem for a planar figure, “quadrature”. In fact, 
supposing $\dot{x} > 0$, it yields: 

$$t = \int_{x(0)}^{x(t)} \frac{d\xi}{\sqrt{\frac{2}{m}\bigl(E - V(\xi)\bigr)}} \tag{2.1.8}$$

when $I \supset [0, t]$. 

Hence, the area under the graph of the curve with equation $\xi \to T(\xi) = 
\bigl(\frac{2}{m}(E - V(\xi))\bigr)^{-1/2}$ above the interval $[x(0), x(t)]$ is the time that the point 
needs to reach $x(t)$, starting from $x(0)$ at time $0$ with positive speed and 
energy $E$, at least for small $t$ (i.e., as long as $\dot{x} > 0$). 

Newton “reduced to quadratures” the simplest problems of motion without 
explicitly using energy conservation ([37], for instance Book I, Propositions 
XXXIX, XLI, LII, LVI, etc.). 
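
As a numerical illustration of the quadrature formula, Eq. (2.1.8), the following minimal sketch evaluates the time integral with the composite Simpson rule. The force $f(x) = -x$ (so that $V(\xi) = \xi^2/2$), the values $m = 1$, $E = 1/2$, and the function name `time_by_quadrature` are assumptions chosen for illustration; they do not come from the text. For this choice the motion with $x(0) = 0$, $\dot{x}(0) = 1$ is $x(t) = \sin t$, so the computed time can be compared with $\arcsin x(t)$.

```python
import math

def time_by_quadrature(V, E, m, x0, x1, n=2000):
    """Approximate t = integral from x0 to x1 of d(xi) / sqrt((2/m) (E - V(xi)))
    with the composite Simpson rule (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (x1 - x0) / n

    def integrand(xi):
        return 1.0 / math.sqrt(2.0 / m * (E - V(xi)))

    total = integrand(x0) + integrand(x1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(x0 + i * h)
    return total * h / 3.0

# Illustrative potential (an assumption): V(xi) = xi^2 / 2, i.e. f(x) = -x.
V = lambda xi: 0.5 * xi * xi
m, E = 1.0, 0.5                      # x(0) = 0 with velocity 1 gives E = 1/2
t_half = time_by_quadrature(V, E, m, 0.0, 0.5)
exact = math.asin(0.5)               # since x(t) = sin t for this choice
```

The sketch only applies while $\dot{x} > 0$, i.e. while $E - V(\xi) > 0$ on the whole integration interval; at a point where $E = V(\xi)$ the integrand is singular and a plain Simpson rule is no longer adequate.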
2.2 General Properties of Motion. Uniqueness 


In the preceding §2.1, a motion developing, under the action of a force $f$, in a 
time interval $I$ was supposed to be given. We can ask which further properties 
of a particular motion allow us to select it from among all motions which, in 
the same time interval $I$, take place under the action of the same force. 

One can even preliminarily ask whether, given an interval $I$, there exist 
any motions, i.e., $C^\infty$ solutions of Eq. (2.1.1) thought of as an equation for 
$t \to x(t)$, $t \in I$. 

In view of the importance of such questions, before proceeding in the 
analysis of Eq. (2.1.1), some attention will be devoted to the general problem 
of the existence, uniqueness, and regularity of the solutions of differential 
equations in $\mathbf{R}^d$. 

Eq. (2.1.1), thought of as a “second-order” differential equation in $\mathbf{R}^1$, is 
equivalent to a “first-order” equation in $\mathbf{R}^2$: it suffices to write it as 

$$\dot{x}(t) = y(t), \qquad m\dot{y}(t) = f(x(t)), \tag{2.2.1}$$

where Eq. (2.2.1) is an equation for the unknown $C^\infty$ function $t \to (x(t), y(t))$ 
defined on $I$ and with values in $\mathbf{R}^2$. 
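
The reduction to a first-order system is also the form in which such equations are usually handed to a numerical integrator. The sketch below is an illustration under stated assumptions, not a method taken from the text: it uses the classical fourth-order Runge-Kutta scheme, the force $f(x) = -x^3$ with $m = 1$ (so $V(x) = x^4/4$), and hypothetical helper names `rk4_step`, `F`, `energy`. It integrates the pair $(x, y)$ of Eq. (2.2.1) and records how well the energy of Eq. (2.1.6) is conserved along the computed motion.

```python
def rk4_step(F, state, t, h):
    """One classical Runge-Kutta step for d(state)/dt = F(state, t)."""
    k1 = F(state, t)
    k2 = F([s + 0.5 * h * k for s, k in zip(state, k1)], t + 0.5 * h)
    k3 = F([s + 0.5 * h * k for s, k in zip(state, k2)], t + 0.5 * h)
    k4 = F([s + h * k for s, k in zip(state, k3)], t + h)
    return [s + h / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

m = 1.0
f = lambda x: -x ** 3          # illustrative force (an assumption)
V = lambda x: 0.25 * x ** 4    # its potential energy, with V(0) = 0

def F(state, t):
    """Right-hand side of the first-order system: dx/dt = y, m dy/dt = f(x)."""
    x, y = state
    return [y, f(x) / m]

def energy(state):
    x, y = state
    return 0.5 * m * y * y + V(x)

state, t, h = [1.0, 0.0], 0.0, 1e-3
E0 = energy(state)
for _ in range(10000):         # integrate up to t = 10
    state = rk4_step(F, state, t, h)
    t += h
drift = abs(energy(state) - E0)
```

The quantity `drift` is not exactly zero because the scheme is approximate; its smallness is a convenient consistency check on the integration, in the spirit of using energy conservation as a qualitative property.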
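More generally, consider an arbitrary “$s$-th order” differential equation in 
$\mathbf{R}^d$, $s = 1, 2, \ldots$, like 

$$\frac{d^s x(t)}{dt^s} = f\Bigl(\frac{d^{s-1}x(t)}{dt^{s-1}}, \ldots, \frac{dx(t)}{dt}, x(t), t\Bigr) \tag{2.2.2}$$

with $t \in I$, where $f$ is an $\mathbf{R}^d$-valued $C^\infty$ function defined on $\mathbf{R}^d \times \ldots \times \mathbf{R}^d \times \mathbf{R}$ and 
$t \to x(t)$ is an unknown $\mathbf{R}^d$-valued $C^\infty$ function on $I$. The latter equation 
may be thought of as a first-order equation in $\mathbf{R}^{sd}$ by setting 

$$\frac{dx(t)}{dt} = y_1(t), \quad \frac{dy_1(t)}{dt} = y_2(t), \quad \ldots, \quad \frac{dy_{s-2}(t)}{dt} = y_{s-1}(t), \quad \frac{dy_{s-1}(t)}{dt} = f(y_{s-1}(t), \ldots, y_1(t), x(t), t) \tag{2.2.3}$$

and then considering Eq. (2.2.3) as an equation for the $C^\infty$ function $t \to 
(x(t), y_1(t), \ldots, y_{s-1}(t))$ defined on the interval $I$ and with values in $\mathbf{R}^d \times 
\ldots \times \mathbf{R}^d = \mathbf{R}^{sd}$. 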

Eq. (2.2.2) is the most general differential equation that will be met in this 
book. By virtue of the preceding remark, it will then suffice, for our purposes, 
to study first-order differential equations in $\mathbf{R}^d$ having the form 

$$\dot{x}(t) = F(x(t), t), \qquad t \in I. \tag{2.2.4}$$


It will be convenient to introduce a precise convention about what a dif- 
ferential equation is or about what one of its solutions is. 
1 Definition. Given an $\mathbf{R}^d$-valued function $F \in C^\infty(\mathbf{R}^d \times \mathbf{R})$, the expression 
(2.2.4), denoted, for short, $\dot{x} = F(x, t)$, will be called a “differential equation 
on $\mathbf{R}^d$ associated with $F$”. 

A “$C^{(k)}$ solution”, $k \ge 1$, of Eq. (2.2.4) on the interval $I$, closed or open or 
semi open, will be a $C^{(k)}$ function which turns Eq. (2.2.4) into an identity 
when substituted into it.¹ A “solution” of Eq. (2.2.4) for $t \in I$ is a $C^\infty$ 
solution. The solutions of Eq. (2.2.4) will often be called “motions”. 


Let us first examine the uniqueness problem for the solutions of Eq. (2.2.4). 


1 Proposition. Let $(\xi, t) \to F(\xi, t)$ be an $\mathbf{R}^d$-valued $C^\infty$ function on $\mathbf{R}^d \times \mathbf{R}$. 
Given $a > 0$, $b > 0$, $t_0 \in \mathbf{R}$, let $t \to x(t)$ be a $C^{(1)}$ solution of Eq. (2.2.4) on 
$J = [t_0 - a, t_0 + b]$: 

(i) the function $t \to x(t)$ is in $C^\infty(J)$; 

(ii) if $t \to y(t)$ is another solution of Eq. (2.2.4) on $J$ and if $y(t_0) = x(t_0)$, 
then $x(t) = y(t)$, $\forall\, t \in J$. 


Observations. 


(1) This proposition applied to Eq. (2.2.2) via Eq. (2.2.3) tells us that two 
$C^{(s)}$ solutions of an $s$-th order differential equation in $\mathbf{R}^d$ for $t \in J$ coincide 
if and only if at time $t_0 \in J$ (“initial time”) they have the same value and the same first $(s - 1)$ 
derivatives (“equal initial data”). When Eq. (2.2.2) is the equation governing 
a physical motion in $\mathbf{R}^d$, it is $s = 2$; this means that the motion is uniquely 
determined, if existing at all, by its initial position $x(t_0)$ and by its initial 
velocity $\dot{x}(t_0)$, i.e., as one says, by its initial “act of motion” $(\dot{x}(t_0), x(t_0))$. 

(2) It would appear that it might be interesting or important to know if, 
by specifying properties of the solutions of Eq. (2.2.2) other than the just- 
mentioned initial data at some initial time, the solution verifying such prop- 
erties is uniquely determined 2, if existing at all. The uniqueness criterion 
that we chose above for illustration purposes, Proposition 1, has been se- 
lected only because it quickly leads to a simple answer and because it is one 
of the uniqueness criteria which are most useful in many applications. 

(3) From the proof it will appear that if $F$ had been only supposed to be of 
class $C^{(k)}$, $k \ge 1$, then uniqueness would have followed in an equal way. The 
regularity of $t \to x(t)$, $t \in J$, could also be deduced in this case, but one would 
only obtain that $t \to x(t)$ is a $C^{(k+1)}$ function. 


PROOF. By integrating both sides of Eq. (2.2.4) and by setting $x_0 = x(t_0) = 
y(t_0)$, we get: 

$$x(t) = x_0 + \int_{t_0}^{t} F(x(\tau), \tau)\, d\tau, \qquad t \in J, \tag{2.2.5}$$


1 We shall see that every $C^{(k)}$ solution, $k \ge 1$, is automatically a $C^\infty$ solution, if $F \in C^\infty$. 

2 For instance, we can ask the following question. Consider Eq. (2.2.2) with $s = 2$ and 
let $t_1, t_2$ be two times and $x_1, x_2 \in \mathbf{R}^d$ be two positions. Is the motion [solution of Eq. 
(2.2.2)] leading from $x_1$ to $x_2$ as time elapses from $t_1$ to $t_2$ (assuming that one such 
motion, at least, exists) unique? We shall see that the answer to this question will, in 
general, be no. 
and, similarly, since also $t \to y(t)$ is a solution of Eq. (2.2.4): 

$$y(t) = x_0 + \int_{t_0}^{t} F(y(\tau), \tau)\, d\tau, \qquad t \in J. \tag{2.2.6}$$

Hence, 

$$x(t) - y(t) = \int_{t_0}^{t} \bigl(F(x(\tau), \tau) - F(y(\tau), \tau)\bigr)\, d\tau. \tag{2.2.7}$$


To prove (ii) the procedure that will be followed is very interesting since 
it obviously goes beyond the particular result that we wish to obtain. 

Informally, the argument is the following: the difference $|x(t) - y(t)|$ is, 
by Eq. (2.2.7), about $|t - t_0|\, |F(x(t),t) - F(y(t),t)|$, if $t \simeq t_0$; however, the 
increment $|F(x(t),t) - F(y(t),t)|$ is proportional, by Lagrange’s theorem, to 
the increment of the argument of $F$, i.e., to $C\,|x(t) - y(t)|$, where $C$ is an 
estimate of the first derivatives of $F$. Hence, Eq. (2.2.7) implies that $|x(t) - 
y(t)|$ and $C\,|t - t_0|\, |x(t) - y(t)|$ are about equal if $t \simeq t_0$, and this, in turn, 
implies that $|x(t) - y(t)| = 0$ for $t$ close to $t_0$ because, for $t \simeq t_0$, one has $C\,|t - t_0| < 1$. 

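To estimate the integrand of Eq. (2.2.7) let $S \subset \mathbf{R}^d$ be a sphere with so 
large a radius that it contains all the values $x(\tau), y(\tau)$, $\forall\, \tau \in J$, and let 

$$M_S = \max_{\xi \in S,\ t \in J}\ \sum_{i,j=1}^{d} \Bigl|\frac{\partial F^{(i)}}{\partial \xi_j}(\xi, t)\Bigr| \tag{2.2.8}$$

where $F^{(i)}(\xi,t)$ is the $i$-th component of the vector $F(\xi,t) = (F^{(1)}(\xi,t), \ldots, 
F^{(d)}(\xi,t)) \in \mathbf{R}^d$. Then, from Taylor’s formula: 

$$|F(x(\tau), \tau) - F(y(\tau), \tau)| \le M_S\, |x(\tau) - y(\tau)|. \tag{2.2.9}$$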
Inserting this inequality into Eq. (2.2.7) yields 

$$|x(t) - y(t)| \le M_S \Bigl|\int_{t_0}^{t} |x(\tau) - y(\tau)|\, d\tau \Bigr|. \tag{2.2.10}$$

Let $M(t) = \max_{t_0 \le \tau \le t} |x(\tau) - y(\tau)|$, $t \in [t_0, t_0 + b]$; then Eq. (2.2.10) implies 
$|x(t) - y(t)| \le M_S\, M(t)\, |t - t_0|$, $\forall\, t \in [t_0, t_0 + b]$. 

Since $M(t)$ is monotonic nondecreasing and since this inequality holds for 
all $t \in [t_0, t_0 + b]$, one easily finds that 

$$M(t) \le M_S\, |t - t_0|\, M(t), \qquad \forall\, t \in [t_0, t_0 + b] \tag{2.2.11}$$

which implies $M(t) = 0$ for $|t - t_0| < M_S^{-1}$, $t \in [t_0, t_0 + b]$. 

Hence, $x(t_0 + M_S^{-1}) = y(t_0 + M_S^{-1})$, if $t_0 + M_S^{-1} < t_0 + b$, and the argument 
can be repeated, replacing $t_0$ by $t_0 + M_S^{-1}$, to show that $M(t) = 0$ for $t \in 
[t_0, t_0 + 2M_S^{-1}]$ if $t_0 + 2M_S^{-1} < t_0 + b$, etc., so that $M(t) = 0$ for $t \in [t_0, t_0 + b]$. 
For $t \in [t_0 - a, t_0]$, one proceeds likewise.³ 


3 Alternatively, Eq. (2.2.10) could be iterated $n$ times to yield, if $\mu = \max |x(\tau) - y(\tau)|$, 
$\tau \in [t_0 - a, t_0 + b]$: 

$$|x(t) - y(t)| \le M_S^n \int_{[t_0,t]} d\tau_1 \int_{[t_0,\tau_1]} d\tau_2 \ldots \int_{[t_0,\tau_{n-1}]} d\tau_n\, |x(\tau_n) - y(\tau_n)| \le M_S^n\, \mu \int_{[t_0,t]} d\tau_1 \int_{[t_0,\tau_1]} d\tau_2 \ldots \int_{[t_0,\tau_{n-1}]} d\tau_n = M_S^n\, \mu\, \frac{|t - t_0|^n}{n!} \le M_S^n\, \mu\, \frac{(a+b)^n}{n!},$$

so that $x(t) - y(t) = 0$ since $n$ is arbitrary and it can be let to $+\infty$. 


To check (i), i.e., that $t \to x(t)$ is a $C^\infty$ function on $J$, remark that 
if $t \to x(t)$ is a $C^{(1)}(J)$ function, then Eq. (2.2.4) implies that $t \to \dot{x}(t)$ 
is in $C^{(1)}(J)$, being a composition of a $C^\infty$ function with a $C^{(1)}$ function; 
furthermore, by differentiating Eq. (2.2.4): 

$$\ddot{x}(t) = \sum_{j=1}^{d} \frac{\partial F}{\partial \xi_j}(x(t), t)\, \dot{x}_j(t) + \frac{\partial F}{\partial t}(x(t), t) \tag{2.2.12}$$

which, in turn, implies that $t \to \ddot{x}(t)$ is a $C^{(1)}$ function by the same argument 
as above. Then, by differentiating Eq. (2.2.12), one finds that the third derivative of $x(t)$ is a $C^{(1)}$ 
function on $J$, etc. mbe 


2.2.1 Problems for §2.2 


1. If $t \to x(t)$, $t \ge 0$, solves $\dot{x} = f(x)$ and $x(0) = x(T)$ for some $T > 0$, then $x(t) = 
x(t+T)$, $\forall\, t \ge 0$; assume $f \in C^\infty(\mathbf{R}^d)$. Would this also be true if $f \in C^{(1)}(\mathbf{R}^d)$? (Hint: Use 
uniqueness.) 


2. The property of the preceding problem is not valid when the differential equation right-hand 
side is explicitly time dependent (i.e., $\dot{x} = f(x, t)$, and $\partial f/\partial t \ne 0$, the “non autonomous 
case”). Find an example. 


3. Let $f(\xi, t)$ be such that $f(\xi, t) = f(\xi, t+T)$ for some $T > 0$ and for all $\xi \in \mathbf{R}^d$. Suppose 
that $t \to x(t)$ is a solution of $\dot{x} = f(x,t)$ such that for some integer $m > 0$, one has 
$x(0) = x(mT)$; then $x(t) = x(t + mT)$, $\forall\, t \ge 0$. (Hint: Use uniqueness.) 


4. Consider the equation $\dot{x}(t) = \ell(t)\, x(t)$ with $\ell \in C^\infty(\mathbf{R})$. Show that if $t \to x(t)$ and 
$t \to y(t)$ are two solutions for $t \in J$ and if $x(t) \not\equiv 0$, there exists a constant $\lambda$ such that 
$y(t) = \lambda x(t)$, $\forall\, t \in J$. 


5. If the function $\ell$ of Problem 4 is periodic with period $T > 0$ and $t \to x(t) \not\equiv 0$ 
is one of its solutions, then also $t \to x(t + T)$ is a solution. Hence, $\exists\, \lambda \ne 0$ such that 
$x(t + T) = \lambda x(t)$. Show that $\lambda > 0$. (Hint: Otherwise either $\lambda = 0$ and $x(T) = 0$, hence 
$x(t) \equiv 0$ (by uniqueness on $[0, +\infty)$), or $\lambda < 0$ and there would be $t \in (0, T]$ where $x(t) = 0$: 
hence, again, $x(t) \equiv 0$ by uniqueness.) 


6. The most general solution $t \to y(t)$, $t \in \mathbf{R}_+$, of the equation in Problem 4, with $\ell$ periodic 
with period $T$, has the form $y(t) = A \lambda^{t/T} z(t)$, where $z \in C^\infty(\mathbf{R}_+)$ is $T$-periodic. 

7.* Consider the equation $\dot{x} = L(t)x$ in $\mathbf{R}^d$, where $t \to L(t)$, $t \in \mathbf{R}$, is a $d \times d$-matrix valued 
$C^\infty$ function. Consider $d$ solutions $x^{(1)}, \ldots, x^{(d)}$ for $t \in I = [a,b]$ and call them “independent” 
if $\exists\, t_0 \in I$ such that the $d$ vectors $x^{(1)}(t_0), \ldots, x^{(d)}(t_0)$ are linearly independent. 
Show that, if $t \in I$, then also $x^{(1)}(t), \ldots, x^{(d)}(t)$ are linearly independent whenever they 
are such for $t = t_0$ and, furthermore, any solution $t \to y(t)$, $t \in I$, can be represented as 
$y(t) = \sum_{j=1}^{d} \lambda_j x^{(j)}(t)$, $\forall\, t \in I$. (Hint: If for $t = \bar{t}$, the $d$ vectors were not independent, 
one could find constants $\lambda_1, \ldots, \lambda_d$, not all equal to zero, such that $\sum_{j=1}^{d} \lambda_j x^{(j)}(\bar{t}) = 0$; 
hence, by linearity and uniqueness, $\sum_{j=1}^{d} \lambda_j x^{(j)}(t) = 0$, $\forall\, t \in I$, which contradicts the 
independence for $t = t_0$.) 


8. Show that Problem 7 implies that, given $d$ solutions $t \to x^{(1)}(t), \ldots, x^{(d)}(t)$, $t \in I$, to 
$\dot{x} = L(t)x$, the matrix $W(t)$ (“Wronskian matrix” of $x^{(1)}, \ldots, x^{(d)}$) defined by 

$$W(t)_{ij} = x^{(j)}_i(t), \qquad i, j = 1, \ldots, d,$$

has a determinant $w(t)$ non vanishing for $t \in I$ if and only if $\exists\, t_0 \in I$ such that $w(t_0) \ne 0$. 
(Hint: By linear algebra, this is just another way of phrasing Problem 7: $d$ vectors are 
linearly independent if and only if the “determinant of their components” is not zero.) 


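9. Using the determinant differentiation rule, by rows, show that 

$$\frac{d}{dt} w(t) = \frac{d}{dt} \det W(t) = \Bigl(\sum_{i=1}^{d} L_{ii}(t)\Bigr) w(t);$$

hence, if $\sum_{i=1}^{d} L_{ii}(t) = \ell(t)$, one has $w(t) = w(t_0)\, e^{\int_{t_0}^{t} \ell(\tau)\, d\tau}$. 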


10. In the context of Problem 8, suppose that the matrix function $t \to L(t)$, $t \in \mathbf{R}$, is 
periodic with period $T > 0$, i.e., $t \to L_{ij}(t)$, $i,j = 1, \ldots, d$ are $T$-periodic functions. Let 
$x^{(1)}, \ldots, x^{(d)}$ be $d$ linearly independent solutions for $t \ge 0$. Then there exist $d^2$ constants 
$A_{ij}$, $i,j = 1, \ldots, d$, such that 

$$x^{(i)}(t + T) = \sum_{j=1}^{d} A_{ij}\, x^{(j)}(t), \qquad t \ge 0.$$

Show that $\det W(T)/\det W(0) = w(T)/w(0) = \det A \ne 0$. 


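11. Suppose that the matrix $A$ is similar, via a real nonsingular matrix $S$, to a real diagonal 
matrix $\Lambda$, $\Lambda_{ij} = \lambda_i \delta_{ij}$, $i,j = 1, \ldots, d$: $S A S^{-1} = \Lambda$. In the context of Problem 10, define 

$$y^{(i)} = \sum_{j=1}^{d} S_{ij}\, x^{(j)}.$$

Show that $y^{(1)}, \ldots, y^{(d)}$ are linearly independent solutions, $\lambda_1, \ldots, \lambda_d \ne 0$, and 
$y^{(j)}(t + T) = \lambda_j\, y^{(j)}(t)$, $t \ge 0$. 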


12. Suppose that $A$ is a matrix similar to a diagonal matrix $\Lambda$ via a complex nonsingular 
matrix $S$. Show that $y^{(1)}, \ldots, y^{(d)}$, defined as in the preceding problem, are complex 
solutions of $\dot{x} = L(t)x$ and that $y^{(j)}(t + T) = \lambda_j\, y^{(j)}(t)$, $\forall\, t \ge 0$. (For applications, recall 
that from linear algebra (see Appendix E), a sufficient condition for the similarity between 
$A$ and a diagonal matrix $\Lambda_{ij} = \lambda_i \delta_{ij}$ is that the roots $\lambda_1, \ldots, \lambda_d$ of the secular equation 
$\det(A - \lambda) = 0$ are pairwise different.) 


13. Given the assumptions of Problems 10, 11 and supposing $\lambda_1, \ldots, \lambda_d > 0$, show that the 
most general solution to $\dot{x} = L(t)x$ has the form 

$$x(t) = \sum_{j=1}^{d} \alpha_j\, \lambda_j^{t/T}\, z^{(j)}(t)$$

where the functions $z^{(1)}, \ldots, z^{(d)}$ are $d$ $C^\infty$ functions periodic with period $T$, and $\alpha_1, \ldots, \alpha_d$ 
are arbitrary constants. (Hint: Let $z^{(j)}(t) = \lambda_j^{-t/T}\, y^{(j)}(t)$.) 
14. Suppose that for every nonzero complex number $\lambda$, there exists a $C^\infty$ function $t \to 
\gamma(t)$, $t \in \mathbf{R}$, such that $\gamma(t + t') = \gamma(t)\gamma(t')$, $\gamma(0) = 1$, $\gamma(T) = \lambda^{-1}$, $\gamma(t) \ne 0$ $\forall\, t \in \mathbf{R}$; then 
the conclusions of Problem 13 would hold, replacing $\lambda^{-t/T}$ by $\gamma(t)$, without the assumption 
$\lambda_j > 0$, $j = 1, \ldots, d$, under the only assumption $\det A \ne 0$. See also the following problem. 


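15. Let $\lambda \in \mathbf{C}$, $\lambda^{-1} = \rho(\cos\theta + i \sin\theta)$, $\rho > 0$, $\theta \in [0, 2\pi]$. Define $\gamma(t) = \rho^{t/T} (\cos \frac{t}{T}\theta + 
i \sin \frac{t}{T}\theta)$. Show that $\gamma(0) = 1$, $\gamma(t)\gamma(t') = \gamma(t + t')$, $\gamma(T) = \lambda^{-1}$, $\gamma(t) \ne 0$, $\forall\, t \in \mathbf{R}$ (e.g., 
$(-1)^{t/T} = \cos \frac{t}{T}\pi + i \sin \frac{t}{T}\pi$). 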


Observations to Problems 8-15. 


We shall see that there always exist $d$ linearly independent solutions to $\dot{x} = L(t)x$. 
However, the existence of $S$ is a restrictive condition. When such an $S$ does not exist, it is 
possible to show that the most general solution to $\dot{x} = L(t)x$, with $L$ periodic with period 
$T > 0$ and $C^\infty$, can be written in the form 

$$x(t) = \sum_{j=1}^{p} \sum_{k=0}^{\delta(j)-1} \alpha_{jk}\, \lambda_j^{t/T}\, t^k\, z^{(j,k)}(t),$$

where $\sum_{j=1}^{p} \delta(j) = d$, and $\delta(j)$, $\lambda_j$ are suitably chosen, and $t \to z^{(j,k)}(t)$, $t \ge 0$, are $C^\infty$ 
functions periodic with period $T$ and possibly complex valued (when $\lambda_j$ are not positive 
and $\lambda^{t/T}$ is interpreted as explained in Problem 15), and $\alpha_{jk}$ are arbitrary constants (see 
[38], for instance, Vol. 1, pp. 63-68). 


16. Consider a differential equation $\ddot{x} + a(t)\dot{x} + b(t)x = 0$, $t \in \mathbf{R}$, $a, b \in C^\infty(\mathbf{R})$. After 
reducing it to a first-order system of two differential equations in $\mathbf{R}^2$, interpret the results 
of Problems 7-15 in terms of its solutions. Show first that the matrix $W(t)$ associated with 
this system is expressed in terms of two of its solutions $t \to x^{(1)}(t)$ and $t \to x^{(2)}(t)$ as 

$$W(t) = \begin{pmatrix} x^{(1)}(t) & x^{(2)}(t) \\ \dot{x}^{(1)}(t) & \dot{x}^{(2)}(t) \end{pmatrix}$$

and $\dot{w}(t) = -a(t)\, w(t)$. 


17.* Extend Problem 16 to the case of the $s$th-order differential equation in $\mathbf{R}$: 

$$\frac{d^s x}{dt^s} + \sum_{j=0}^{s-1} a_j(t)\, \frac{d^j x}{dt^j} = 0, \qquad t \in \mathbf{R}.$$


2.3 General Properties of Motion. Existence 


An existence problem for the solutions of Eq. (2.2.4), hence of Eq. (2.2.2), 
naturally associated with the uniqueness property given in Proposition 1, 
§2.2, is solved by the following proposition: 


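2 Proposition. Let $F$ be an $\mathbf{R}^d$-valued function in $C^\infty(\mathbf{R}^d \times \mathbf{R})$. Let $\xi_0 \in \mathbf{R}^d$ 
and $t_0 \in \mathbf{R}$. Let $S(\xi_0, \rho)$ be the closed ball in $\mathbf{R}^d$ with center $\xi_0$ and radius $\rho$. 
Let $\theta > 0$. There exists $T_{\rho,\theta} > 0$ and a solution of Eq. (2.2.4), i.e., $\dot{x} = F(x, t)$, 
defined for $t \in [t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$ and of class $C^\infty$ such that: 

$$x(t_0) = \xi_0, \qquad x(t) \in S(\xi_0, \rho), \qquad t \in [t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]. \tag{2.3.1}$$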
Furthermore, if one defines: 

$$M_{\rho,\xi_0,t_0,\theta} \stackrel{def}{=} \max_{\xi \in S(\xi_0,\rho),\ t \in [t_0 - \theta,\, t_0 + \theta]} |F(\xi, t)| \equiv M \tag{2.3.2}$$

one can choose 

$$T_{\rho,\theta} = \frac{\rho}{\rho + \theta M}\, \theta. \tag{2.3.3}$$

Observations. 


(1) By Proposition 1, §2.2, it is enough to show the existence of a $C^{(1)}$ 
solution verifying Eq. (2.3.1). 

(2) The proof that follows is “constructive” in the sense that it provides a 
sequence $t \to x^{(n)}(t)$, $t \in [t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$, of functions approximating (as 
$n \to \infty$) the solution and, at the same time, it provides an estimate of the 
approximation error defined as $\max |x(t) - x^{(n)}(t)|$, where the maximum is 
taken on the interval $[t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$. 

(3) It is often useful, in applications, not to follow the solution scheme pro- 
posed by the following proof of Proposition 2. It might, in fact, be more 
convenient to use ad hoc procedures based on the particular features of the 
F under analysis in a concrete case. Usually, with such procedures one finds 
much better error estimates than the ones following from general methods, 
where one cannot take into account some special properties of the equations 
(e.g., symmetry properties, Hamiltonian form, etc.). 

(4) To understand informally the bound on the magnitude of the interval of 
existence consider first that, during the proof, it appears necessary to have an 
a priori control of how far $x(t)$ can travel away from the initial position $\xi_0$. 
The continuity of $F$ guarantees the boundedness of the maximum of $|F(\xi, t)|$, 
for, say, $\xi \in S(\xi_0, \rho)$, $t \in [t_0 - \theta, t_0 + \theta]$. It follows that during the whole 
time interval $[t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$, the point $x(t)$ stays inside $S(\xi_0, \rho)$ because 
$\dot{x}(t) = F(x(t), t)$ and the right-hand side of this relation does not exceed $M$, 
Eq. (2.3.2): notice, in fact, that $T_{\rho,\theta}$ has been chosen, just to achieve this 
effect, smaller than both $\theta$ and $\rho M^{-1}$ (i.e., $T_{\rho,\theta} = (\theta^{-1} + \rho^{-1} M)^{-1}$ so that 
$M T_{\rho,\theta} < \rho$). 

(5) The interval $[t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$ is certainly not optimal, at least because 
the choice of the set $S(\xi_0, \rho) \times [t_0 - \theta, t_0 + \theta]$, where the maximum of $|F|$ is 
considered, was arbitrary. A better existence interval could be obtained using 
this arbitrariness and optimizing the result over the possible sets on which one 
takes the maximum. Also, once the existence of a solution verifying Proposition 2 
has been established, one could apply Proposition 2 and Proposition 1 
to the equation with initial datum $x(t_0 + T_{\rho,\theta})$ at the initial time $t_0 + T_{\rho,\theta}$, thus 
continuing it beyond $t_0 + T_{\rho,\theta}$. However one cannot hope, in general, for an infinite 
existence interval containing $\mathbf{R}_+$: this can be seen through counterexamples. 
The simplest among them is provided by the equation $\dot{x} = x^2$, $x(0) = 1$, in $\mathbf{R}$. 
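
The counterexample $\dot{x} = x^2$, $x(0) = 1$ can be explored numerically. The sketch below is only an illustration under stated assumptions: it uses the explicit solution $x(t) = x_0/(1 - x_0 t)$, which blows up at $t = 1/x_0$, and repeatedly applies the estimate of Eq. (2.3.3) with the ball radius $\rho$ taken equal to the current value of $x$ and a large $\theta$ (both choices, and the names `existence_time` and `continued_time`, are hypothetical). The accumulated guaranteed existence time keeps growing with each continuation but stays below the blow-up time.

```python
def existence_time(rho, theta, M):
    """T_{rho,theta} = rho * theta / (rho + theta * M), cf. Eq. (2.3.3)."""
    return rho * theta / (rho + theta * M)

def continued_time(x0=1.0, steps=50, theta=1.0e3):
    """Chain the existence estimate for dx/dt = x^2, x(0) = x0 > 0.

    At each step the ball radius rho is set to the current x (an illustrative
    choice), M is the maximum of xi^2 on [x - rho, x + rho], and the exact
    solution is advanced by the guaranteed time T.
    """
    t, x = 0.0, x0
    for _ in range(steps):
        rho = x
        M = (x + rho) ** 2
        T = existence_time(rho, theta, M)
        x = x / (1.0 - x * T)      # exact flow of dx/dt = x^2 over a time T
        t += T
    return t, x

t_total, x_final = continued_time()
```

With these choices each step covers roughly a quarter of the time remaining before the blow-up, so `t_total` approaches, without reaching, the value $1$, while `x_final` grows without bound as the number of steps increases.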
PROOF. Rather than studying $C^{(1)}$ solutions of $\dot{x} = F(x, t)$ verifying the initial 
conditions (2.3.1), look for $\mathbf{R}^d$-valued $C^{(0)}([t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}])$ solutions of 
the equation: 

$$x(t) = \xi_0 + \int_{t_0}^{t} F(x(\tau), \tau)\, d\tau. \tag{2.3.4}$$

Every $C^{(0)}([t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}])$ function verifying Eq. (2.3.4) is a $C^{(1)}$ 
solution to the original equation also verifying Eq. (2.3.1), and vice versa. 
For $t \in [t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$ define the sequence of $\mathbf{R}^d$-valued functions $t \to 
x^{(n)}(t)$, $n = 0, 1, \ldots$, through the following recursive scheme: 

$$x^{(0)}(t) = \xi_0, \qquad x^{(1)}(t) = \xi_0 + \int_{t_0}^{t} F(x^{(0)}(\tau), \tau)\, d\tau, \qquad \ldots, \qquad x^{(n)}(t) = \xi_0 + \int_{t_0}^{t} F(x^{(n-1)}(\tau), \tau)\, d\tau, \tag{2.3.5}$$

and remark that each such function is in $C^\infty(\mathbf{R})$ and it is natural to try taking 
the limit as $n \to +\infty$. The existence, uniformly in $t \in [t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$, of 

$$\lim_{n \to \infty} x^{(n)}(t) = x(t) \tag{2.3.6}$$

should imply that the limit function will also be continuous. Existence and 
uniformity of the limit is obtained by rewriting it as 

$$x^{(0)} + \sum_{k=1}^{\infty} \bigl(x^{(k)}(t) - x^{(k-1)}(t)\bigr) \tag{2.3.7}$$

and deducing that if 

$$\mu_k = \max_{t \in [t_0 - T_{\rho,\theta},\, t_0 + T_{\rho,\theta}]} |x^{(k)}(t) - x^{(k+1)}(t)|, \tag{2.3.8}$$

then 

$$\sum_{k=0}^{\infty} \mu_k < +\infty. \tag{2.3.9}$$

This will mean that the series of Eq. (2.3.7) is uniformly convergent for $t \in 
[t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$: hence, the same will hold for the limit of Eq. (2.3.6). 
To estimate $\mu_k$ we can refer to Eq. (2.3.5) to obtain for $k = 2, 3, \ldots$, 

$$x^{(k)}(t) - x^{(k-1)}(t) = \int_{t_0}^{t} \bigl(F(x^{(k-1)}(\tau), \tau) - F(x^{(k-2)}(\tau), \tau)\bigr)\, d\tau. \tag{2.3.10}$$


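Through Lagrange’s theorem in the form 

$$|F(\xi, \tau) - F(\eta, \tau)| \le L\, |\xi - \eta|, \qquad \forall\, \xi, \eta \in S(\xi_0, \rho),\ \forall\, \tau \in [t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}] \tag{2.3.11}$$

where 

$$L = \max_{\xi \in S(\xi_0,\rho),\ \tau \in [t_0 - T_{\rho,\theta},\, t_0 + T_{\rho,\theta}]}\ \sum_{i,j=1}^{d} \Bigl|\frac{\partial F^{(i)}}{\partial \xi_j}(\xi, \tau)\Bigr|, \tag{2.3.12}$$

Eqs. (2.3.10) and (2.3.11) imply: 

$$|x^{(k)}(t) - x^{(k-1)}(t)| \le L \int_{[t_0,t]} |x^{(k-1)}(\tau) - x^{(k-2)}(\tau)|\, d\tau \tag{2.3.13}$$

$\forall\, k = 2, 3, \ldots$ provided we preliminarily check that for all $k = 0, 1, \ldots$, the 
functions $t \to x^{(k)}(t)$, $t \in [t_0 - T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$, take their values in $S(\xi_0, \rho)$. 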

This last property is proved inductively starting from Eq. (2.3.5): keeping 
in mind the choice of $T_{\rho,\theta}$ (chosen, as essentially stated in observation (4), 
just in such a way to make this property true) suppose, inductively, that 
$|x^{(h)}(t) - \xi_0| \le \rho$, $\forall\, h = 0, \ldots, k - 1$; it is a property which holds for $k = 1$. 
To check that $|x^{(k)}(t) - \xi_0| \le \rho$ remark that Eqs. (2.3.5) and (2.3.3) give 

$$|x^{(k)}(t) - \xi_0| \le \int_{[t_0,t]} d\tau\, |F(x^{(k-1)}(\tau), \tau)| \le M_{\rho,\xi_0,t_0,\theta}\, |t - t_0| \le \rho. \tag{2.3.14}$$

Iterating Eq. (2.3.13) and using Eq. (2.3.14) with $k = 1$ yields, for $t \in [t_0 - 
T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$, 

$$|x^{(k)}(t) - x^{(k-1)}(t)| \le L^{k-1} \int_{[t_0,t]} d\tau_1 \int_{[t_0,\tau_1]} d\tau_2 \ldots \int_{[t_0,\tau_{k-2}]} d\tau_{k-1}\, |x^{(1)}(\tau_{k-1}) - \xi_0| \le \rho\, \frac{L^{k-1}\, T_{\rho,\theta}^{k-1}}{(k-1)!} \tag{2.3.15}$$

since $T_{\rho,\theta} \ge |t - t_0|$. Eq. (2.3.15) shows the convergence of the series of Eq. 
(2.3.9) and, therefore, the limit of Eq. (2.3.6) exists uniformly for $t \in [t_0 - 
T_{\rho,\theta}, t_0 + T_{\rho,\theta}]$ and defines a function $t \to x(t)$ on this interval with values in 
$S(\xi_0, \rho)$. It satisfies Eq. (2.3.4) as it is seen by taking the $n \to \infty$ limit in Eq. 
(2.3.5) and by using the uniformity of the limit of Eq. (2.3.6) to exchange the 
integration with the limit. mbe 
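
The recursive scheme of Eq. (2.3.5) can be carried out exactly in simple cases. The following sketch assumes, purely for illustration, the scalar equation $\dot{x} = x$ with $t_0 = 0$ (this example and the helper names `picard_step`, `picard_iterates`, `evaluate` are not from the text). Each iterate is then a polynomial in $t$, stored as a list of exact rational coefficients, and the $n$-th iterate turns out to be the $n$-th Taylor polynomial of $\xi_0 e^{t}$.

```python
import math
from fractions import Fraction

def picard_step(coeffs, xi0):
    """One step of the scheme (2.3.5) for F(x, t) = x and t0 = 0:
    x_new(t) = xi0 + integral from 0 to t of x(tau) d(tau).
    A polynomial c0 + c1 t + c2 t^2 + ... is stored as [c0, c1, c2, ...]."""
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] = Fraction(xi0)
    return integrated

def picard_iterates(xi0, n):
    """Return the list [x^(0), x^(1), ..., x^(n)] of polynomial iterates."""
    x = [Fraction(xi0)]
    out = [x]
    for _ in range(n):
        x = picard_step(x, xi0)
        out.append(x)
    return out

def evaluate(coeffs, t):
    """Evaluate a coefficient list at the time t."""
    return sum(float(c) * t ** k for k, c in enumerate(coeffs))

iterates = picard_iterates(1, 10)
approx_e = evaluate(iterates[10], 1.0)   # tenth iterate at t = 1, close to e
```

In this example successive iterates differ by the single term $t^k/k!$, which is the factorial decay that makes the series of Eq. (2.3.7) converge uniformly on bounded intervals.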


2.3.1 Problems 
1. Give a lower estimate for the magnitude of $T_{\rho,\theta}$, the amplitude of the existence interval 
as in Proposition 2, for the following second-order equations, assuming $x(0) = 0$, $\dot{x}(0) = 1$ 
or $x(0) = 1$, $\dot{x}(0) = 0$ as initial data at $t_0 = 0$: 
LS, f=2r4+2°, pe ete, ï = x, x == —sinz. 


Also estimate $\sup_{\rho,\theta} T_{\rho,\theta}$ from below. (Hint: Reduce the equation to first order and then 
apply Proposition 1.) 

2. Solve the equation $\ddot{x} = x$ with initial datum $x(0) = 1$, $\dot{x}(0) = 0$. 


3. Solve the equations $\dot{x} = x^2$, $\dot{x} = \cos x$, $\dot{x} = (\cos x)^2$ with initial datum $x(0) = 1$. 


4. Solve the equation $\dot{x} = x + y$, $\dot{y} = -x + 2y$ with initial datum $x(0) = 0$, $y(0) = 1$. 


5. Using the “quadrature method”, solve the equation $\ddot{x} = 4(x^3 - x)$, $x(0) = 0$, $\dot{x}(0) = \sqrt{2}$ (see 
§2.1, final comment). 


6. As in Problem 5 for $\ddot{x} = -(4x^3 + 6x^2 - 2)$, $x(0) = 0$, $\dot{x}(0) = \sqrt{2}$. 


7. Find two linearly independent solutions for the equation in Problem 4. 

8.* Compute $w(t)$ for the equation in Problem 4 (see Problem 8, §2.2). 


9.* Let $t \to L(t)$ be a $d \times d$-matrix-valued $C^\infty$ function on $\mathbf{R}$. Show that the equation 
$\dot{x}(t) = L(t)x(t)$ admits $d$ linearly independent solutions defined for $|t| \le T$ with $T$ small 
enough. (Hint: Let $x^{(j)}$ be the solution with initial data $x^{(j)}_i(0) = \delta_{ij}$, $i,j = 1, \ldots, d$. Then 
evaluate an existence interval for such initial data.) 


10.* Compute Tı, ı for the equation in Problem 9 when |to| < o and £o is arbitrary, 
£o = x(to); for the symbols, see Proposition 1. Show that |€o|Ti,1 can be taken to be 
independent of to and £o at a given ø > 0. Deduce from this that every solution to x = L(t)x 
can be extended to a solution defined for t € R. 


11. Let $L$ be a $d \times d$ matrix and consider the equation $\dot{x} = Lx$ in $\mathbf{R}^d$. Suppose that $L$ has 
$d$ pairwise distinct real eigenvalues (see Appendix E for the eigenvalue notion) $\lambda_1, \ldots, \lambda_d$. 
Let $v^{(1)}, \ldots, v^{(d)}$ be the respective real linearly independent eigenvectors (see Appendix E). 
Show that the functions $t \to e^{\lambda_j t} v^{(j)}$ are $d$ linearly independent solutions. Show that any 
solution $t \to x(t)$ has the form 

$$x(t) = \sum_{j=1}^{d} \alpha_j\, e^{\lambda_j t}\, v^{(j)}, \qquad \text{with } (\alpha_1, \ldots, \alpha_d) \in \mathbf{R}^d.$$


2.4 General Properties of Motion. Regularity. 


In proving Proposition 1, §2.2, it was found that $C^{(1)}$ solutions of $\dot{x} = F(x, t)$, $F \in 
C^\infty(\mathbf{R}^d \times \mathbf{R})$, are necessarily $C^\infty$ solutions. This is the simplest regularity 
property shown by the solutions of such differential equations. Other regularity 
properties of the solutions will be now analyzed. 

In applications it often happens that the right-hand side of Eq. (2.2.4) 
depends on parameters $\alpha \in \mathbf{R}^m$ and that, furthermore, it is important to 
know how the solutions change as the initial data $\xi_0$ and the parameters $\alpha$ 
vary in $\mathbf{R}^d$ and $\mathbf{R}^m$, respectively. A first answer to this question is provided 
by the following proposition. 


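3 Proposition. Let $(\xi, t, \alpha) \to F(\xi, t, \alpha)$ be a $C^\infty(\mathbf{R}^d \times \mathbf{R} \times \mathbf{R}^m)$ function 
taking its values in $\mathbf{R}^d$, and consider the equation 

$$x(t) = \xi_0 + \int_{t_0}^{t} F(x(\tau), \tau, \alpha_0)\, d\tau \tag{2.4.1}$$

as an equation for the continuous function $t \to x(t)$ parameterized by $(\xi_0, t_0, \alpha_0) 
\in \mathbf{R}^d \times \mathbf{R} \times \mathbf{R}^m$. Given $\rho, \theta, a > 0$ and $(\bar{\xi}, \bar{t}, \bar{\alpha}) \in \mathbf{R}^d \times \mathbf{R} \times \mathbf{R}^m$, there exists 
$T > 0$ such that: 

(i) Eq. (2.4.1) admits a solution for every $(\xi_0, t_0, \alpha_0)$ close enough to $(\bar{\xi}, \bar{t}, \bar{\alpha})$ 
such that $|\bar{\xi} - \xi_0| < \frac{\rho}{2}$, $|\bar{t} - t_0| < \frac{\theta}{2}$, $|\bar{\alpha} - \alpha_0| < a$. Such solution will be denoted 
$t \to S_t(\xi_0; t_0, \alpha_0)$ and it is defined for $t \in [t_0 - T, t_0 + T]$. 

(ii) The function $S_t(\xi_0; t_0, \alpha_0)$, defined for 

$$|\bar{\xi} - \xi_0| < \frac{\rho}{2}, \qquad |\bar{t} - t_0| < \frac{\theta}{2}, \qquad |\bar{\alpha} - \alpha_0| < a, \qquad |t - t_0| < T \tag{2.4.2}$$

takes its values inside the ball $S(\bar{\xi}; \rho)$ with center $\bar{\xi}$ and radius $\rho$ and it is a 
$C^\infty$ function of its arguments. 

(iii) The value $T$ can be taken as: 

$$T = \frac{\rho\, \theta}{2\,(\rho + \theta \max |F(\xi, t, \alpha)|)} \tag{2.4.3}$$

where the maximum is considered on the set $|\xi - \bar{\xi}| \le \rho$, $|t - \bar{t}| \le \theta$, $|\alpha - \bar{\alpha}| \le a$. 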
Observations. 


(1) Eq. (2.4.1) is equivalent to 

$$\dot{x}(t) = F(x(t), t, \alpha_0), \qquad x(t_0) = \xi_0 \tag{2.4.4}$$

and, therefore, the above proposition provides a regularity theorem for the 
solutions of Eq. (2.4.4) as functions of the initial data, of the initial time, of 
time itself, and of the parameters $\alpha$ on which $F$ may possibly depend. The set 
(2.4.2) and the key estimate (2.4.3) should not be taken too seriously as they 
are not optimal: they merely show an example of the type of concreteness 
that can be attained in the formulation of a regularity criterion (see, also, 
observation 4, p. 19). 


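(2) Let $\beta = (\beta_1, \ldots, \beta_{d+m+2}) = ((\xi_0)_1, \ldots, (\xi_0)_d, (\alpha_0)_1, \ldots, (\alpha_0)_m, t, t_0)$ and 

$$x(t) = (x_1(t), \ldots, x_d(t)) = S_t(\xi_0; t_0, \alpha_0) = (S_t(\xi_0; t_0, \alpha_0)_1, \ldots, S_t(\xi_0; t_0, \alpha_0)_d). \tag{2.4.5}$$

Formal differentiation of Eq. (2.4.4) with respect to $\beta_i$, $i = 1, 2, \ldots, m + d$, 
gives 

$$\frac{d}{dt} \frac{\partial x(t)}{\partial \beta_i} = \sum_{k=1}^{d} \frac{\partial F}{\partial \xi_k}(x(t), t, \alpha_0)\, \frac{\partial x_k(t)}{\partial \beta_i} + \sum_{k=1}^{m} \frac{\partial F}{\partial \alpha_k}(x(t), t, \alpha_0)\, \frac{\partial (\alpha_0)_k}{\partial \beta_i}, \tag{2.4.6}$$

$$\frac{\partial x(t)}{\partial \beta_i}\Big|_{t = t_0} = \frac{\partial \xi_0}{\partial \beta_i}. \tag{2.4.7}$$

Analogous equations for the higher-order derivatives can also be obtained. 

(3) From the proof it will follow that the above $d(m + d)$ derivatives $\frac{\partial x_j(t)}{\partial \beta_i}$ do actually 
verify these equations. 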

(4) The $d(m + d)$ equations (2.4.6) and (2.4.7) can be considered by imagining 
that $x(t)$ is a known function [obtained by first solving Eq. (2.4.4)]. Then, for 
each $i = 1, \ldots, m + d$, Eq. (2.4.6) can be thought of as a system of $d$ differential 
equations for the functions of $t$, $t \to \frac{\partial x_j(t)}{\partial \beta_i}$, $j = 1, \ldots, d$, with initial data at $t_0$ given by Eq. 
(2.4.7). Each such system can be solved by regarding it as an ordinary linear 
system of differential equations of the type (2.2.4) with suitable initial data. 

Actually this is a method to compute the derivatives which, as it will 
appear in several instances, turns out to be quite useful. It is also useful in 
numerical computations. 

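(5) Similarly, equations for the $t$ or $t_0$-derivatives follow from Eq. (2.4.4): 

$$\frac{d}{dt} \frac{\partial x(t)}{\partial t_0} = \sum_{h=1}^{d} \frac{\partial F}{\partial \xi_h}(x(t), t, \alpha_0)\, \frac{\partial x_h(t)}{\partial t_0}, \qquad \frac{\partial x(t)}{\partial t_0}\Big|_{t = t_0} = -F(\xi_0, t_0, \alpha_0) \tag{2.4.9}$$

to which remarks (3) and (4) apply. 

(6) Had $F$ been assumed to be a $C^{(k)}$ function, $k \ge 1$, on $\mathbf{R}^d \times \mathbf{R} \times \mathbf{R}^m$ 
one could still have obtained a regularity result: however, one could only 
show, with the same proof that follows, that the function $(t, \xi_0, \alpha_0, t_0) \to 
S_t(\xi_0; t_0, \alpha_0)$ is a $C^{(k)}$ function in the region of Eq. (2.4.2). 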

(7) Proposition 3 also yields a regularity theorem for the solutions of higher- 
order differential equations, of the type considered in Eq. (2.2.2), via the 
reduction to first order described in Eq. (2.2.3). The explicit statement of the 
corresponding results is left as a problem for the reader. 
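
The remark in observation (4) that Eqs. (2.4.6) and (2.4.7) are useful in numerical computations can be made concrete with a small sketch. The choices below are assumptions made for illustration only: the scalar equation $\dot{x} = x^2$ without parameters, the classical fourth-order Runge-Kutta scheme, and the hypothetical names `rk4` and `augmented`. The unknown $u(t) = \partial x(t)/\partial \xi_0$ is integrated together with $x(t)$ by appending to the equation the linear equation $\dot{u} = \frac{\partial F}{\partial \xi}(x(t))\, u$ with $u(t_0) = 1$; the result can be compared with the explicit solution $x(t) = \xi_0/(1 - \xi_0 t)$, for which $\partial x/\partial \xi_0 = (1 - \xi_0 t)^{-2}$.

```python
def rk4(G, z, t, h, steps):
    """Integrate dz/dt = G(z, t) with the classical Runge-Kutta scheme."""
    for _ in range(steps):
        k1 = G(z, t)
        k2 = G([a + 0.5 * h * b for a, b in zip(z, k1)], t + 0.5 * h)
        k3 = G([a + 0.5 * h * b for a, b in zip(z, k2)], t + 0.5 * h)
        k4 = G([a + h * b for a, b in zip(z, k3)], t + h)
        z = [a + h / 6.0 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(z, k1, k2, k3, k4)]
        t += h
    return z

def augmented(z, t):
    """Equation dx/dt = x^2 together with its variational equation.
    u stands for the derivative of x(t) with respect to the initial datum."""
    x, u = z
    return [x * x, 2.0 * x * u]    # F = x^2, dF/dxi = 2x

xi0, t_end, steps = 1.0, 0.5, 5000
x_end, u_end = rk4(augmented, [xi0, 1.0], 0.0, t_end / steps, steps)
```

For $\xi_0 = 1$ and $t = 0.5$ the explicit formulas give $x = 2$ and $\partial x/\partial \xi_0 = 4$. Solving the augmented system avoids the cancellation errors of a finite-difference quotient between two nearby solutions.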


PROOF. This proof is essentially a repetition of the proof of Proposition 2, on 
the existence property. Here a sketch is provided, leaving to the reader the 
elaboration of the details, if he deems it necessary. 

The statement about the existence (and uniqueness) of the solutions 
$t \to S_t(\xi_0; t_0, \alpha_0)$ follows easily from Proposition 2: Proposition 2 also implies 
the estimate (2.4.3) for $T$ which follows from Eq. (2.3.3), identifying the 
parameters $\rho, \theta$ of Proposition 2 with $\rho/2, \theta/2$. 

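First check that $(\xi_0, t, t_0, \alpha_0) \to S_t(\xi_0; t_0, \alpha_0)$ is a $C^{(1)}$ function on the set 
(2.4.2). Let $\beta = (\beta_1, \ldots, \beta_{m+d+2})$ be defined as in observation 2. As seen in 
§2.3, $t \to S_t(\xi_0; t_0, \alpha_0)$ can be thought of as being obtained via a limit of the 
functions $t \to x^{(n)}(t, \xi_0, t_0, \alpha_0)$ recursively defined for $t \in [t_0 - T, t_0 + T]$ by 

$$x^{(0)}(t, \xi_0, t_0, \alpha_0) = \xi_0, \qquad x^{(n)}(t, \xi_0, t_0, \alpha_0) = \xi_0 + \int_{t_0}^{t} F(x^{(n-1)}(\tau, \xi_0, t_0, \alpha_0), \tau, \alpha_0)\, d\tau \tag{2.4.10}$$

for $n = 1, 2, 3, \ldots$. The functions $(t, \xi_0, t_0, \alpha_0) \to x^{(n)}(t, \xi_0, t_0, \alpha_0)$ are [see Eq. 
(2.4.10)] $C^\infty$ functions of their arguments, $\forall\, n$. Furthermore, differentiating 
Eq. (2.4.10) with respect to $\beta_i$, $i = 1, \ldots, m + d + 2$, it is: 

$$\frac{\partial x^{(n)}(t, \xi_0, t_0, \alpha_0)}{\partial \beta_i} = \frac{\partial \xi_0}{\partial \beta_i} + \int_{t_0}^{t} \Bigl\{ \sum_{k=1}^{d} \frac{\partial F}{\partial \xi_k}(x^{(n-1)}(\tau, \xi_0, t_0, \alpha_0), \tau, \alpha_0)\, \frac{\partial x_k^{(n-1)}(\tau, \xi_0, t_0, \alpha_0)}{\partial \beta_i} + \sum_{k=1}^{m} \frac{\partial F}{\partial \alpha_k}(x^{(n-1)}(\tau, \xi_0, t_0, \alpha_0), \tau, \alpha_0)\, \frac{\partial (\alpha_0)_k}{\partial \beta_i} \Bigr\}\, d\tau + F(x^{(n-1)}(t, \xi_0, t_0, \alpha_0), t, \alpha_0)\, \frac{\partial t}{\partial \beta_i} - F(\xi_0, t_0, \alpha_0)\, \frac{\partial t_0}{\partial \beta_i}, \tag{2.4.11}$$

where the last two terms arise from the contributions from the integration 
extremes. This relation between $\frac{\partial x^{(n)}}{\partial \beta_i}$ and $\frac{\partial x^{(n-1)}}{\partial \beta_i}$ can be used to estimate 
$\frac{\partial x^{(n)}}{\partial \beta_i} - \frac{\partial x^{(n-1)}}{\partial \beta_i}$ along the lines of proof of Proposition 2. By proceeding in 
the same way and remarking that Eq. (2.4.10), by the choice of $T$, implies 
$\forall\, t \in [t_0 - T, t_0 + T]$ and $\forall\, n = 0, 1, \ldots$, 

$$|x^{(n)}(t, \xi_0, t_0, \alpha_0) - \xi_0| \le \frac{\rho}{2}, \tag{2.4.12}$$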


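it follows, from Eqs. (2.4.11) and (2.4.12), existence of two constants $M, L$ 
[see Eqs. (2.3.12) and (2.3.15)] such that: 

$$\Bigl|\frac{\partial x^{(n)}}{\partial \beta_i}(t, \xi_0, t_0, \alpha_0)\Bigr| \le M \tag{2.4.13}$$

$$\Bigl|\frac{\partial x^{(n)}}{\partial \beta_i}(t, \xi_0, t_0, \alpha_0) - \frac{\partial x^{(n-1)}}{\partial \beta_i}(t, \xi_0, t_0, \alpha_0)\Bigr| \le M\, \frac{(L T)^{n-1}}{(n-1)!} \tag{2.4.14}$$

hold for all $(t, \xi_0, t_0, \alpha_0)$ in the region of Eq. (2.4.2) and for all $n = 1, 2, \ldots$. 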
Then Eqs. (2.4.13) and (2.4.14) imply existence and uniformity, in region 
of Eq. (2.4.2), of the limit: 
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Ox”) 
pilt, €0,¢0,@0) = lim Di, (t, €0, to, œo) 
(2.4.15) 
Ox) 2 ox ôx”) 
—(t t ——(t t — ——(t t 
OB; ( , 0, 0, Qo) t2 OB; ( 760i 0, &0) OB; ( #608 0,0)), 
Yi = 1,2,...,m +d + 2. The above limit is, therefore, a continuous function 


in the region of Eq. (2.4.2). 

Since the limit $\lim_{n \to \infty} x^{(n)}(t, \xi_0, t_0, \alpha_0)$ exists and equals $S_t(\xi_0; t_0, \alpha_0)$,
the uniformity of the limit in Eq. (2.4.15) guarantees permutability of the limit
and of the $\partial/\partial \beta_i$ operations, thereby showing differentiability of $S_t(\xi_0; t_0, \alpha_0)$ in
the region of Eq. (2.4.2). It also shows, en passant, via the consideration of the
limit as $n \to \infty$ of Eq. (2.4.11), the validity of the statements in observation
3.

An essentially identical argument can be developed to show that
$(t, \xi_0, t_0, \alpha_0) \to S_t(\xi_0; t_0, \alpha_0)$ is in class $C^{(p)}$, $\forall p \ge 1$, in the region of Eq. (2.4.2). It
will suffice to differentiate Eq. (2.4.11) suitably many times to obtain relations
analogous to it for the higher derivatives; such relations will then be used to
obtain estimates analogous to Eqs. (2.4.13) and (2.4.14). mbe
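
The recursion of Eq. (2.4.10) can also be explored numerically. What follows is only a minimal sketch and not part of the original argument: the function name `picard_iterates`, the test equation $\dot x = x$ with $\xi_0 = 1$, $t_0 = 0$, and the trapezoidal approximation of the integral are all assumptions made for illustration.

```python
import math

def picard_iterates(F, xi0, t0, T, n_iter, n_grid=2001):
    """Return the time grid and the iterates x^(0), ..., x^(n_iter) on [t0, t0 + T].

    Each iterate is x^(n)(t) = xi0 + integral_{t0}^{t} F(x^(n-1)(tau), tau) dtau,
    with the integral approximated by the cumulative trapezoidal rule.
    """
    h = T / (n_grid - 1)
    ts = [t0 + k * h for k in range(n_grid)]
    x = [xi0] * n_grid                      # x^(0)(t) = xi0
    iterates = [x]
    for _ in range(n_iter):
        g = [F(xk, tk) for xk, tk in zip(x, ts)]
        new = [xi0]
        for k in range(1, n_grid):
            new.append(new[-1] + 0.5 * h * (g[k - 1] + g[k]))
        x = new
        iterates.append(x)
    return ts, iterates

# Hypothetical test problem: F(x, t) = x, whose exact solution is exp(t).
ts, its = picard_iterates(lambda x, t: x, 1.0, 0.0, 1.0, n_iter=12)
errors = [max(abs(xk - math.exp(tk)) for xk, tk in zip(x, ts)) for x in its]
```

In this example the list `errors` holds the largest distance between each iterate and the exact solution on the grid. It shrinks quickly with $n$, until the discretization error of the trapezoidal rule dominates.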


2.4.1 Exercises and Problems 


1. Solve the equation $\ddot x - 2\dot x + \alpha x = 0$, $\alpha > 1$, with initial data $x(0) = x_0$, $\dot x(0) = v_0$, by
finding two solutions of the form $t \to A e^{\lambda t}$. By taking the limit $\alpha \to 1$ find the solution,
with the same initial data, to $\ddot x - 2\dot x + x = 0$ (using Proposition 3).


2. Show that the equations $\ddot x = -\varepsilon x$, $\varepsilon > 0$, and $\ddot x = 0$ have, for the same initial conditions,
solutions $x_\varepsilon(t)$ and $x_0(t)$ such that $\lim_{\varepsilon \to 0} x_\varepsilon(t) = x_0(t)$, $\forall t \in \mathbf{R}$. However, show that this
limit relation is not uniform in $t \in \mathbf{R}$, except for special initial data.


3. Consider the equation $\dot x = F(x, t, \alpha)$ and suppose that $F(0, t, \alpha) = 0$. Then, given $R > 0$
and fixed $t_0 = 0$, show the existence of $\varepsilon > 0$, $\sigma > 0$, $\kappa > 0$, such that:

$$(1 - \kappa)|w| \le |S_t w| \le (1 + \kappa)|w|$$

having denoted $S_t w$ the solution to the equation with initial datum $w$ at $t_0 = 0$. (Hint:
Apply Lagrange's theorem to estimate $|S_t w - w|$ in terms of the maximum of $|F|$ in a
suitable set, and then, likewise, $|S_t w - S_t 0| = |S_t w|$ (as $S_t 0 = 0$), for $|t| < \varepsilon$, $|w| < \sigma$:
use the regularity theorem to bound the derivatives of $t, w, \alpha \to S_t w$; see observations 2-5


to Proposition 3.)


The theory developed so far for the equation:

$$\dot x(t) = F(x(t), t), \qquad x(t_0) = \xi_0, \tag{2.5.1}$$

where $F$ is an $\mathbf{R}^d$-valued function in $C^\infty(\mathbf{R}^d \times \mathbf{R})$, is a “local theory”; the
existence theorem given in Proposition 2, §2.3, gives, in fact, a solution to
Eq. (2.5.1) defined in a finite neighborhood of $t_0$. It is often necessary in
applications to have “global solutions”, i.e., solutions to Eq. (2.5.1) defined in
time intervals containing a neighborhood of $\mathbf{R}_+ = [0, +\infty)$. In analyzing this
problem, the following definition is useful.


2 Definition. A solution $t \to S_t(\xi_0; t_0)$ of Eq. (2.5.1) defined for $t \in (a, b)$
is called “maximal” if there are no other solutions defined in open intervals
properly containing $(a, b)$.
Two solutions of Eq. (2.5.1) defined in two open intervals $I_1$ and $I_2$ coincide
in $I_1 \cap I_2$ (see Proposition 1, p. 14); if $I_2 \supset I_1$, the second solution is said to be a
“continuation” of the first.


Observations. 


(1) A solution to Eq. (2.5.1) is, therefore, maximal if and only if it “cannot 
be continued”. 

(2) For every initial datum $\xi_0$ and every initial time $t_0$, there is a solution to
Eq. (2.5.1) which is maximal: the interval of definition of such a solution is the
union of all open intervals on which it is possible to define a solution.

(3) This maximality definition only involves open intervals; however, this
notion would be the same even if other types of intervals were allowed in the
definition of maximality. To understand this, just use the existence theorem
of §2.3, p. 18, to continue solutions out of closed or half-closed intervals.


The following proposition clarifies the above notion by showing that a 
solution of a differential equation can be non global in the future (or in the 
past) if and only if it “diverges in a finite time”. 


4 Proposition. Let $t \to S_t(\xi_0; t_0)$ be a maximal solution for Eq. (2.5.1) and
$(a, b)$ be the interval on which this solution is defined. If $b < +\infty$, it must be

$$\limsup_{t \to b^-} |S_t(\xi_0; t_0)| = +\infty; \tag{2.5.2}$$

if $a > -\infty$, it must be

$$\limsup_{t \to a^+} |S_t(\xi_0; t_0)| = +\infty. \tag{2.5.3}$$


PROOF. Assume $b < +\infty$ and that Eq. (2.5.2) does not hold. Then there exists
$K < +\infty$ such that

$$|S_t(\xi_0; t_0)| \le K, \qquad \forall t \in [t_0, b). \tag{2.5.4}$$

Using Proposition 2, we can find for every $\tau \in [t_0, b)$ a solution to the equation:

$$\dot x = F(x(t), t), \qquad x(\tau) = S_\tau(\xi_0; t_0) \tag{2.5.5}$$

defined for $t \in [\tau - T_\tau, \tau + T_\tau]$, where $T_\tau$, by Eq. (2.3.3) with $\rho = \theta = 1$, can
be chosen as

$$T_\tau = \Bigl(1 + \max_{|\xi - x(\tau)| \le 1,\ |t - \tau| \le 1} |F(\xi, t)|\Bigr)^{-1}
\ge \Bigl(1 + \max_{|\xi| \le 1 + K,\ t_0 - 1 \le t \le b + 1} |F(\xi, t)|\Bigr)^{-1} \equiv T_1. \tag{2.5.6}$$

The solution under investigation can therefore be extended to a solution
defined for

$$t \in \bigcup_{\tau \in [t_0, b)} (\tau - T_1, \tau + T_1), \tag{2.5.7}$$

manifestly contradicting the supposed maximality of $(a, b)$. A similar
argument holds if $a > -\infty$. mbe
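
A concrete instance of the divergence described by Proposition 4 may help. The sketch below is an illustration under stated assumptions, not taken from the text: it uses the equation $\dot x = x^2$ with $x(0) = \xi_0 > 0$, whose solution $x(t) = \xi_0/(1 - \xi_0 t)$ has maximal interval $(-\infty, 1/\xi_0)$, and the helper names `solution` and `residual` are hypothetical.

```python
def solution(t, xi0):
    """Exact solution of x' = x**2 with x(0) = xi0 > 0, valid for t < 1/xi0."""
    return xi0 / (1.0 - xi0 * t)

def residual(t, xi0, h=1e-6):
    """Central-difference check of x' - x**2 at time t (should be close to 0)."""
    dx = (solution(t + h, xi0) - solution(t - h, xi0)) / (2.0 * h)
    return dx - solution(t, xi0) ** 2

xi0 = 2.0
b = 1.0 / xi0                       # right end of the maximal interval
# Values of the solution at the times b - 10^(-k): they grow without bound.
samples = [solution(b - 10.0 ** (-k), xi0) for k in range(1, 7)]
```

Here `samples` grows by a factor of ten each time the distance from $b$ shrinks by ten, which is the behavior of Eq. (2.5.2) for this particular equation.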


Considering Proposition 4, it is convenient to introduce the following def- 
inition. 

3 Definition. Consider the differential equation $\dot x = F(x, t)$ with $F$ being
an $\mathbf{R}^d$-valued $C^\infty(\mathbf{R}^d \times \mathbf{R})$ function. Suppose that there is an $\mathbf{R}_+$-valued
continuous function defined on $\mathbf{R}^3$: $(r, s, t) \to \mu(r, s, t)$ such that if $t \to S_t(\xi_0; t_0)$
is a solution to Eq. (2.5.1) defined for $t \in (a, b)$ then:

$$|S_t(\xi_0; t_0)| \le \mu(r, t_0, t), \qquad \forall\, |\xi_0| \le r,\ t_0 \le t. \tag{2.5.8}$$


The differential equation is then said to be “normal” in the future. If, in addition,
$\mu$ can be chosen bounded as $t \to +\infty$, the equation will be said to
have “bounded trajectories” in the future.


Observations. 


(1) Eq. (2.5.8) is a strong condition on the motions generated by $\dot x = F(x, t)$;
because of its independence of the existence interval $(a, b)$, it is often called
an “a priori estimate” on the motions governed by $\dot x = F(x, t)$.

(2) An equation of higher order, like Eq. (2.2.2), will be called normal, or with
bounded trajectories, if once reduced to a first-order equation it becomes
a normal equation, or an equation with bounded trajectories, in the sense
just introduced. More concretely, this means that it is possible to give an a
priori estimate (i.e., independent of the interval of definition) of the sizes of
$x(t), \dot x(t), \dots, \frac{d^{s-1} x}{dt^{s-1}}(t)$ in terms of the observation time $t$, of the initial time
$t_0$, and of the initial data $x(t_0), \dot x(t_0), \dots, \frac{d^{s-1} x}{dt^{s-1}}(t_0)$; furthermore, the bound
depends continuously on those parameters.


The importance of the definition is manifest in the following proposition. 


5 Proposition. If the differential equation (2.5.1), $\dot x = F(x, t)$, is normal,
then it admits a “global solution”, i.e., a solution defined in a neighborhood of
$[t_0, +\infty)$, for any given initial datum $\xi_0$ and initial time $t_0$.

PROOF. Let $(a, b)$ be a maximal existence interval for a solution to Eq. (2.5.1),
and suppose that $b < +\infty$. Then by Definition 3:

$$\limsup_{t \to b^-} |S_t(\xi_0; t_0)| \le \mu(|\xi_0|, t_0, b) < +\infty \tag{2.5.9}$$

would hold, contradicting Proposition 4, Eq. (2.5.2). mbe


An example of a normal equation (which also has bounded trajectories) is 
provided by the following proposition. 


6 Proposition. Consider the differential equation $m\ddot x = f(x)$ in $\mathbf{R}$ [see Eq.
(2.1.1)], describing the motions of a point with mass $m > 0$, on a line and
subject to a force depending only on the position, $f \in C^\infty(\mathbf{R})$. Suppose that
the potential energy $V$, see Eq. (2.1.3), is bounded below. Then the differential
equation is normal. If $\lim_{\xi \to \pm\infty} V(\xi) = +\infty$, the differential equation also has
bounded trajectories.


PROOF. If $t \to x(t)$ is a solution to Eq. (2.1.1), with $x(t_0) = \xi_0$, $\dot x(t_0) = \eta_0$
and defined for $t \in (a, b)$, by energy conservation (see §2.1):

$$\tfrac{1}{2} m \dot x(t)^2 + V(x(t)) = E = \tfrac{1}{2} m \eta_0^2 + V(\xi_0), \qquad \forall t \in (a, b), \tag{2.5.10}$$

and therefore, if $M = \inf_{\xi \in \mathbf{R}} V(\xi)$:

$$|\dot x(t)| = \sqrt{\tfrac{2}{m}\bigl(E - V(x(t))\bigr)} \le \Bigl(\tfrac{2}{m}(E - M)\Bigr)^{1/2} \tag{2.5.11}$$

and $M > -\infty$, by assumption. Furthermore,

$$|x(t)| = \Bigl|\xi_0 + \int_{t_0}^{t} \dot x(\tau)\, d\tau\Bigr| \le |\xi_0| + \Bigl(\tfrac{2}{m}(E - M)\Bigr)^{1/2} |t - t_0| \tag{2.5.12}$$

which, calling $\mu(|\xi_0|, t_0, t)$ the right-hand side of Eq. (2.5.12), yields an a priori
estimate, showing normality.

If $\lim_{\xi \to \pm\infty} V(\xi) = +\infty$, let $\xi \to W(\xi)$ be a symmetric (i.e., $W(\xi) = W(-\xi)$)
continuous function which is strictly increasing for $\xi > 0$ and which is
a lower bound to $V(\xi)$: $V(\xi) \ge W(\xi)$, $\forall \xi \in \mathbf{R}$, and such that $\lim_{\xi \to \infty} W(\xi) = +\infty$.
Since $V$ is supposed bounded below, such a function does exist.

Let $\mu(E)$ be the positive solution to $W(\xi) = E$, existing for all $E \ge M$, i.e.
for all $E$'s of the form (2.5.10). Then the motion with energy $E$ given by
the right-hand side of Eq. (2.5.10) must verify $|x(t)| \le \mu(E)$, as $|x(t)| > \mu(E)$
would imply, by the left-hand side of Eq. (2.5.10) and by the choice of $W$,
that $\tfrac{1}{2} m \dot x(t)^2 < 0$.

By the assumed continuity and strict monotonicity of $W$, the function
$\mu(E)$ is continuous in $E$ for $E \ge M$; hence, a $(t_0, t)$-independent a priori bound
$|x(t)| \le \mu(E)$ has been obtained. mbe
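
The two a priori bounds in this proof can be checked on a numerically integrated motion. The following is a sketch under assumptions that are not in the text: the potential $V(x) = x^4/4$ (so $M = 0$ and one may take $W = V$, giving $\mu(E) = (4E)^{1/4}$), unit mass, the velocity-Verlet scheme, and the helper name `verlet` are all chosen here only for illustration.

```python
import math

def verlet(f, x0, v0, dt, n_steps, m=1.0):
    """Velocity-Verlet integration of m x'' = f(x); returns the lists of x and v."""
    xs, vs = [x0], [v0]
    x, v = x0, v0
    a = f(x) / m
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = f(x) / m
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        xs.append(x)
        vs.append(v)
    return xs, vs

V = lambda x: 0.25 * x ** 4         # hypothetical potential, bounded below by M = 0
f = lambda x: -x ** 3               # f = -dV/dx
m, x0, v0, dt, n = 1.0, 0.5, 1.0, 1e-3, 20000
E = 0.5 * m * v0 ** 2 + V(x0)       # energy, as in Eq. (2.5.10)
M = 0.0
xs, vs = verlet(f, x0, v0, dt, n, m)

speed_bound = math.sqrt(2.0 * (E - M) / m)   # right-hand side of Eq. (2.5.11)
mu_E = (4.0 * E) ** 0.25                     # positive root of W(xi) = E with W = V
# Linear-in-time bound of Eq. (2.5.12), with t0 = 0 and a small rounding allowance.
growth_ok = all(abs(x) <= abs(x0) + speed_bound * k * dt + 1e-9
                for k, x in enumerate(xs))
# Time-independent bound |x(t)| <= mu(E), up to the integrator's small energy error.
bounded_ok = all(abs(x) <= mu_E * (1.0 + 1e-4) for x in xs)
```

The first check corresponds to normality, the second to bounded trajectories. The tolerances only absorb the discretization error of the integrator.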
This section will be concluded by the following remark on “a priori estimates”.
In applications one often meets functions $(x, t) \to F(x, t)$ which are $C^\infty$
functions for $(x, t) \in (\mathbf{R}^d/A) \times \mathbf{R}$, where $A$ is a “singularity set” usually consisting
of points, lines, surfaces, or even of a set with interior points; inside $A \times \mathbf{R}$
the function $F$ might be undefined. In such cases the singularity of $F$ means that the
model originating the differential equations (2.5.1) is not a good model of the
physical phenomenon that it hopes to describe, at least if the initial data or
the motion generated by them enter the region $A$.

For instance, the attractive force exerted by the Sun on the Earth is well
described by the formula $k|x|^{-2}$ only if the distance between the Earth and
the Sun is large compared to the Sun's diameter; it is clear that the singularity
at $x = 0$ is purely fictitious and due to an excessive idealization!

In such cases one is free to modify $F$ by changing it into a function $F^{(A)} \in C^\infty(\mathbf{R}^d \times \mathbf{R})$
which, outside a small neighborhood of $A \times \mathbf{R}$, coincides with
$F$. The equation

$$\dot x = F^{(A)}(x, t) \tag{2.5.13}$$

will then be an equally good model of the same physical phenomenon.

However, it is obvious that the only interesting motions, among those
described by Eq. (2.5.13), will be those evolving outside a neighborhood of
$A \times \mathbf{R}$, where, in fact, $F$ and $F^{(A)}$ are indistinguishable.

In this book equations of the form (2.5.1) with $F$ singular in some region
will occasionally be considered. However, in all those cases it will also be
possible to establish an “a priori estimate” guaranteeing the existence of a
continuous positive function $\mu'$ on $\mathbf{R}^d \times \mathbf{R} \times \mathbf{R}$ such that if $S_t(\xi_0; t_0)$ is a
solution to Eq. (2.5.13), defined for $t \in (a, b)$ with initial datum at $t_0$ given
by $\xi_0 \in$ {set of initial data “thought of as interesting”} $\equiv \Delta$, then

$$d(S_t(\xi_0; t_0), A) \ge \mu'(\xi_0, t, t_0) \tag{2.5.14}$$

where $d(\xi, A)$ = (distance of $\xi$ from $A$) and $\mu'$ is positive for $\xi_0 \in \Delta$. Usually
one shall fix $\Delta = A^c$ = (complement of $A$) by possibly enlarging the set $A$.
By what has been said so far, it appears that if we are interested only in
motions starting outside $A$ and $\Delta = A^c$, we shall imagine that such motions
verify Eq. (2.5.13) and, therefore, we shall be able to apply to them the various
results concerning the differential equations with right-hand side in $C^\infty$.


The above elucubrations motivate the following definition: 


4 Definition. Let $(\xi, t) \to F(\xi, t)$ be a $C^\infty$ function defined on $(\mathbf{R}^d/A) \times \mathbf{R}$
with values in $\mathbf{R}^d$, where $A \subset \mathbf{R}^d$ is a closed set. Suppose that:

(i) there exists an $\mathbf{R}^d$-valued function $F^{(A)} \in C^\infty(\mathbf{R}^d \times \mathbf{R})$ coinciding
with $F$ on $(\mathbf{R}^d/A) \times \mathbf{R}$;

(ii) there exists a real valued function $\mu$ on $\mathbf{R}^d \times \mathbf{R} \times \mathbf{R}$, continuous and
positive valued on $(\mathbf{R}^d/A) \times \mathbf{R} \times \mathbf{R}$, such that if $t \to x(t)$ is a motion verifying
$x(t_0) = \xi_0$, $\dot x(t) = F(x(t), t)$, $\forall t \in (a, b)$, then $\forall t \ge t_0$ and $t \in (a, b)$:

$$d(S_t(\xi_0, t_0), A) \ge \mu(\xi_0, t, t_0) > 0; \tag{2.5.15}$$

(iii) the differential equation $\dot x = F^{(A)}(x, t)$ is normal.

In such a situation we shall say that the “singular differential equation”
$\dot x = F(x, t)$ is “normal outside $A$”.

It is an exercise to prove the following proposition. 


7 Proposition. Let $(\xi, t) \to F(\xi, t)$ be an $\mathbf{R}^d$-valued $C^\infty$ function on
$(\mathbf{R}^d/A) \times \mathbf{R}$ and consider the singular differential equation $\dot x = F(x, t)$: if
this equation is normal outside $A$, every initial datum $\xi_0 \notin A$ originates a
$C^\infty$ solution of

$$\dot x = F(x, t); \qquad x(t_0) = \xi_0; \qquad x(t) \notin A \tag{2.5.16}$$

defined in a neighborhood of $[t_0, +\infty)$, i.e., a global solution.


Observation. As the reader will verify when looking at Chapter 4, §4.8, §4.9, an 
interesting example of the situation contemplated in Definition 4 and Propo- 
sition 7 can be found in the two-body problem: the set A will be, in this case, 
the closure of a neighborhood of the set of the initial data with vanishing areal 
velocity. Such data are those in which the two bodies are heading into or out 
of a collision and which are, therefore, to be considered singular. 


2.5.1 Exercises and Problems 


1. Formulate the notions of normal differential equation “in the past” or of differential
equation with bounded trajectories “in the past”, and reformulate all the propositions of
§2.5 to deal with the problem of the existence of solutions in intervals like $t \in (-\infty, t_0]$ or
$t \in (-\infty, +\infty)$.


2. Consider the equation in $\mathbf{R}$, $\ddot x + 4 \log(1 + x^2) = 0$. Determine whether it is normal and
with bounded trajectories. Compute $x(1)$ with a 60% approximation if $x(0) = 0$, $\dot x(0) = 1$.


3. Same as Problem 2 for ë + sin x = 0, x(0) = 0, (0) Ł. 


4. Same as Problems 2 and 3 for the differential equations in Problems 1 and 2 of §2.3. 


5.* Same as Problem 2 but with a 1% approximation and using a desk computer together 
with the error estimate implicit in the existence theorem of §2.3. Alternatively, use the 
algorithm of Appendix O, together with a desk computer. 


6.* Same as Problem 5 but using energy conservation and the relative quadrature formula, 
together with a desk computer. 


7.* Same as Problem 6 but for the equation in Problem 3. 


8. Let $t \to x(t)$ be an $\mathbf{R}^d$-valued $C^\infty(\mathbf{R})$ function such that $\exists M > 0$ for which

$$|x(t)| \le |x(0)| + M \int_0^t |x(\tau)|\, d\tau, \qquad t \ge 0.$$

Show that $|x(t)| \le y(t)$, $t \ge 0$, where $y$ is defined as the solution of $y(t) = |x(0)| + M \int_0^t y(\tau)\, d\tau \ge 0$,
i.e., $y(t) = |x(0)| e^{Mt}$.

9.* If $\xi \to \gamma(\xi)$ is a continuous positive monotonically increasing function of $\xi \in \mathbf{R}_+$ and if
$t \to x(t)$ is in $C^\infty(\mathbf{R})$ and

$$|x(t)| \le |x(0)| + \int_0^t \gamma(|x(\tau)|)\, d\tau, \qquad t \ge 0,$$

show that $|x(t)| \le y(t)$, $t \ge 0$, where $y$ is defined as the solution of $y(t) = |x(0)| + \int_0^t \gamma(y(\tau))\, d\tau$,
i.e., setting $\Phi(y) = \int_{|x(0)|}^{y} \frac{d\xi}{\gamma(\xi)}$, as the function verifying $\Phi(y(t)) = t$ (or
$y(t) = \Phi^{-1}(t)$).


10.* Given the equation $\dot x = f(x, t)$ in $\mathbf{R}^d$, define for $T > 0$:
$\gamma_T(s) \stackrel{def}{=} \max_{t \in [0, T],\ |\xi| \le s} |f(\xi, t)|$.
Show that a sufficient condition for the normality of the equation is that, in $\mathbf{R}$,

$$\dot y = \gamma_T(y), \qquad y(0) = |x(0)|$$

admits a global solution (i.e., a solution on $[0, +\infty)$) for all $T > 0$. (Hint: $x(t) = x(0) + \int_0^t f(x(\tau), \tau)\, d\tau \Rightarrow |x(t)| \le |x(0)| + \int_0^t \gamma_T(|x(\tau)|)\, d\tau$; then apply Problem 9.)


11.* If $t \to L(t)$ is a matrix-valued $C^\infty(\mathbf{R})$ function with values in the $d \times d$ matrices, the
equation $\dot x = L(t)x$ is normal in the future (as well as in the past); hence, it has global
solutions. (Hint: Apply Problem 10.)


12. In the context of Problem 11, show that the equation admits $d$ linearly independent
global solutions (defined on $(-\infty, +\infty)$). (Hint: Use Problem 11 and Problem 9 of §2.3.)


13. In the context of Problem 11, suppose that $L(t)$ is a time-independent matrix $L$. Using
the results of Problem 11 of §2.3, p. 22, and supposing that all the eigenvalues $\lambda_j$ of $L$ are real
and pairwise distinct, show that the equation $\dot x = Lx$ has bounded trajectories if and only
if $\lambda_j \le 0$, $j = 1, \dots, d$.


14.* In the context of Problem 11, let $g \in C^\infty(\mathbf{R})$ be an $\mathbf{R}^d$-valued function. Show the
normality of the equation $\dot x = L(t)x + g(t)$.

15.* Consider a differential equation in $\mathbf{R}^d$, $\dot x = F(x, t)$, with $F \in C^\infty(\mathbf{R}^d \times \mathbf{R})$. Suppose
that $|F(x, t)| \le \gamma(t)|x| + \beta(t)$, where $\beta, \gamma \in C^\infty(\mathbf{R})$, $\beta, \gamma \ge 0$. Show that the equation is
normal by finding an a priori estimate. (Hint: Combine Problems 9 and 4.)

16.* Same as Problem 15 with $|F(x, t)| \le \beta(t) + \gamma(t) \log(e + |x|)$.

17. Consider a differential equation $\dot x = f(x, t, \alpha)$ of the type considered in §2.4,
$f \in C^\infty(\mathbf{R}^d \times \mathbf{R} \times \mathbf{R}^m)$. Suppose that this equation admits an a priori estimate like Eq.
(2.5.8), $\forall \alpha \in \mathbf{R}^m$, with an $\alpha$-independent function. Show that, in this case, the
“local regularity theorem”, Proposition 3, p. 22, becomes “global”, i.e., the function
$(t, \xi_0, t_0, \alpha_0) \to S_t(\xi_0; t_0, \alpha_0)$ is a $C^\infty$-function of $\xi_0 \in \mathbf{R}^d$, $\alpha_0 \in \mathbf{R}^m$, $t_0 \in \mathbf{R}$, $t \in \mathbf{R}$,
$t \ge t_0$.


2.6 More on Differential Equations. Autonomous 
Equations 


Before proceeding in the analysis of some applications, it is convenient to set
up a few more definitions, mainly as an excuse to illustrate some simple but
interesting general remarks about differential equations.

5 Definition. Let $(\xi, \xi^{(1)}, \dots, \xi^{(s-1)}) \to f(\xi, \xi^{(1)}, \dots, \xi^{(s-1)})$ be an $\mathbf{R}^d$-valued
$C^\infty$ function on $\mathbf{R}^{sd}$. Consider the equation for the $\mathbf{R}^d$-valued function $t \to x(t)$
defined for $t$ in an interval $I$ [see Eq. (2.2.2)]:

$$\frac{d^s x}{dt^s} = f\Bigl(x, \frac{dx}{dt}, \dots, \frac{d^{s-1} x}{dt^{s-1}}\Bigr) \tag{2.6.1}$$

$$x(t_0) = \xi_0, \quad \frac{dx}{dt}(t_0) = \xi^{(1)}, \quad \dots, \quad \frac{d^{s-1} x}{dt^{s-1}}(t_0) = \xi^{(s-1)}. \tag{2.6.2}$$

Eq. (2.6.1) will be called an “autonomous” differential equation of class $C^\infty$.
In other words, Eq. (2.2.2) is said to be autonomous when the right-hand side
“does not explicitly depend upon time”.

The space $\mathbf{R}^d \times \dots \times \mathbf{R}^d = \mathbf{R}^{sd}$, thought of as the space of the possible initial
data $(\xi_0, \xi^{(1)}, \dots, \xi^{(s-1)})$ for Eq. (2.6.1), will be called the “space of the initial
data” or the “data space”.


It is also useful to introduce the following definition. 


6 Definition. We shall say that a $C^\infty$ autonomous differential equation like
Eq. (2.6.1) is “reversible” if any solution, $t \to x(t)$, to Eq. (2.6.1) defined for
$t \in (-\varepsilon, \varepsilon)$, $\varepsilon > 0$, is such that the function $t \to x(-t)$, $t \in (-\varepsilon, \varepsilon)$, is also a
solution to Eq. (2.6.1).


Observation. We shall see that many differential equations describing non
dissipative dynamical systems are reversible. Basically, $f$ originates a reversible
system when $s$ is even and $f$ depends evenly on the odd derivatives. It should
be kept in mind that $t \to x(-t)$ will in general be a solution which corresponds
to different initial data (unless $s = 1$): for instance $\ddot x = x$ is an equation in $\mathbf{R}^1$
which is reversible, but its solution $x(t) = e^{t}$ has initial data $x(0) = 1$, $\dot x(0) = 1$
while the solution $t \to e^{-t}$ has initial data $x(0) = 1$, $\dot x(0) = -1$.


The interest in autonomous equations lies, from a mathematical point of 
view, in the validity of the following easy propositions. 


8 Proposition. Consider a normal autonomous first-order⁴ differential
equation in $\mathbf{R}^d$. It is possible to define on $\mathbf{R}^d$ a family $(S_t)_{t \ge 0}$ of maps, mapping
$\mathbf{R}^d$ into itself, such that the functions

$$t \to S_{t - t_0}(\xi_0), \qquad \xi_0 \in \mathbf{R}^d,\ t, t_0 \in \mathbf{R},\ t \ge t_0 \tag{2.6.3}$$

solve Eq. (2.6.1) with initial datum at $t = t_0$ given by $\xi_0$. For every $t \ge 0$, the
map $S_t$ is a $C^\infty$ map and

$$S_t(S_{t'}(\xi)) = S_{t + t'}(\xi), \qquad \forall t, t' \ge 0,\ \forall \xi \in \mathbf{R}^d. \tag{2.6.4}$$

Furthermore, the maps $S_t$ are $C^\infty$ regular jointly in $\xi_0$ and $t$: i.e., the functions
$(t, \xi_0) \to S_t(\xi_0)$, $(t, \xi_0) \in \mathbf{R}_+ \times \mathbf{R}^d$, are in $C^\infty(\mathbf{R}_+ \times \mathbf{R}^d)$.


⁴ I.e., $s = 1$ in Eq. (2.6.1).

PROOF. Let $t \to S_t(\xi_0; t_0)$ be the solution to Eqs. (2.6.1) and (2.6.2) with
$s = 1$, defined for $t \ge t_0$. Such a solution does exist since Eq. (2.6.1) is now
supposed to be a normal equation. Let:

$$S_t(\xi_0) = S_t(\xi_0; 0) \qquad \text{for } t \ge 0. \tag{2.6.5}$$

From §2.5 it follows that $S_t$ is a $C^\infty$ map of $\mathbf{R}^d$ into itself for each $t \in \mathbf{R}_+$ and, also, that
$(t, \xi_0) \to S_t(\xi_0)$, $(t, \xi_0) \in \mathbf{R}_+ \times \mathbf{R}^d$, is in $C^\infty(\mathbf{R}_+ \times \mathbf{R}^d)$. For $t \ge t_0$, let
$x(t) = S_{t - t_0}(\xi_0)$. Since $f$ “does not explicitly depend on time”, it is

$$\frac{d}{dt} x(t) = \frac{d}{dt} S_{t - t_0}(\xi_0) = \frac{d}{dt} S_{t - t_0}(\xi_0; 0)
= f(S_{t - t_0}(\xi_0; 0)) = f(S_{t - t_0}(\xi_0)) = f(x(t)). \tag{2.6.6}$$

Hence $t \to S_{t - t_0}(\xi_0)$ is a solution to Eq. (2.6.1) for $t \ge t_0$. Furthermore,

$$S_{t_0 - t_0}(\xi_0) = S_0(\xi_0) = S_0(\xi_0; 0) = \xi_0 \tag{2.6.7}$$

which, by the uniqueness theorem, Proposition 1, p. 14, gives $S_{t - t_0}(\xi_0) = S_t(\xi_0; t_0)$,
$t \ge t_0$. Similarly, one checks that $t \to S_t(S_{t'}(\xi_0))$ is a solution to Eq.
(2.6.1) with initial datum at $t = 0$ equal to $S_{t'}(\xi_0)$; such is also $t \to S_{t + t'}(\xi_0)$;
hence, Eq. (2.6.4) is also proved. mbe
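
The composition rule of Eq. (2.6.4) can be verified on an equation whose flow is known in closed form. The sketch below rests on an assumption that is not in the text: it takes the autonomous equation $\dot x = -x^3$, which is normal in the future because $|x(t)| \le |\xi_0|$, and whose flow is $S_t(\xi) = \xi/\sqrt{1 + 2\xi^2 t}$ for $t \ge 0$. The names `S` and `semigroup_defect` are hypothetical.

```python
import math

def S(t, xi):
    """Exact flow of x' = -x**3 for t >= 0: S_t(xi) = xi / sqrt(1 + 2 xi^2 t)."""
    if t < 0:
        raise ValueError("the flow is only used here for t >= 0")
    return xi / math.sqrt(1.0 + 2.0 * xi * xi * t)

def semigroup_defect(t, s, xi):
    """|S_t(S_s(xi)) - S_{t+s}(xi)|, which Eq. (2.6.4) says must vanish."""
    return abs(S(t, S(s, xi)) - S(t + s, xi))

defects = [semigroup_defect(t, s, xi)
           for t in (0.0, 0.3, 2.0)
           for s in (0.0, 1.5, 7.0)
           for xi in (-3.0, 0.0, 0.4, 10.0)]
```

All entries of `defects` are at the level of rounding error. Note that the closed formula ceases to make sense for $t < -1/(2\xi^2)$, so this example is not normal in the past and yields maps only for $t \ge 0$.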


9 Corollary. Consider an autonomous equation of order $s$, as in Eq. (2.6.1),
and suppose that it is normal. It is possible to define, on the data space $\mathbf{R}^{sd}$,
a family $(S_t)_{t \ge 0}$ of $C^\infty$ maps of $\mathbf{R}^{sd}$ into itself such that the function

$$t \to S_{t - t_0}(\xi_0, \xi^{(1)}, \dots, \xi^{(s-1)}) = (x(t), x^{(1)}(t), \dots, x^{(s-1)}(t)) \tag{2.6.8}$$

is a solution to the equations

$$\dot x(t) = x^{(1)}(t), \quad \dot x^{(1)}(t) = x^{(2)}(t), \quad \dots, \quad \dot x^{(s-2)}(t) = x^{(s-1)}(t),
\quad \dot x^{(s-1)}(t) = f(x(t), x^{(1)}(t), \dots, x^{(s-1)}(t)) \tag{2.6.9}$$

[equivalent to Eq. (2.6.1)] and verifies the initial data

$$x(t_0) = \xi_0, \quad x^{(1)}(t_0) = \xi^{(1)}, \quad \dots, \quad x^{(s-1)}(t_0) = \xi^{(s-1)}. \tag{2.6.10}$$

Furthermore,

$$S_t S_{t'} = S_{t + t'}, \qquad \forall t, t' \ge 0 \tag{2.6.11}$$

and the maps $S_t$ are $C^\infty$ regular also, jointly in $t$ and $(\xi_0, \dots, \xi_{s-1})$; i.e.,
the map $(t, \xi_0, \dots, \xi_{s-1}) \to S_t(\xi_0, \dots, \xi_{s-1})$, with $(t, \xi_0, \dots, \xi_{s-1}) \in \mathbf{R}_+ \times \mathbf{R}^d \times \dots \times \mathbf{R}^d$,
is in $C^\infty(\mathbf{R}_+ \times \mathbf{R}^d \times \dots \times \mathbf{R}^d)$.

PROOF. It is an immediate consequence of the equivalence between Eqs.
(2.6.1), (2.6.2) and Eqs. (2.6.9), (2.6.10) and of Proposition 8.
mbe
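
The reduction of Eq. (2.6.1) to the first-order system of Eq. (2.6.9) is also how such equations are usually handed to a numerical integrator. The following is a minimal sketch for a scalar unknown ($d = 1$); the helper names `first_order_field` and `rk4_flow`, the choice of the classical Runge-Kutta scheme, and the test equation $\ddot x = -x$ are assumptions made for illustration and do not come from the text.

```python
import math

def first_order_field(f, s):
    """Vector field of Eq. (2.6.9) for the scalar equation x^(s) = f(x, x', ..., x^(s-1))."""
    def field(y):
        if len(y) != s:
            raise ValueError("expected a point of the data space with s components")
        # y = (x, x^(1), ..., x^(s-1)): the first s-1 components shift, the last is f.
        return list(y[1:]) + [f(*y)]
    return field

def rk4_flow(field, y0, t, n_steps=1000):
    """Approximate S_t(y0) with n_steps classical Runge-Kutta steps."""
    h = t / n_steps
    y = list(y0)
    for _ in range(n_steps):
        k1 = field(y)
        k2 = field([a + 0.5 * h * b for a, b in zip(y, k1)])
        k3 = field([a + 0.5 * h * b for a, b in zip(y, k2)])
        k4 = field([a + h * b for a, b in zip(y, k3)])
        y = [a + h * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
             for a, b1, b2, b3, b4 in zip(y, k1, k2, k3, k4)]
    return y

# Hypothetical example with s = 2: x'' = -x with data (x, x') = (1, 0).
field = first_order_field(lambda x, v: -x, 2)
x_pi, v_pi = rk4_flow(field, [1.0, 0.0], math.pi)
```

For this example the exact motion is $x(t) = \cos t$, so the computed pair `(x_pi, v_pi)` should be close to $(-1, 0)$.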


7 Definition. Given a normal $s$-th order autonomous differential equation
on $\mathbf{R}^d$, the family $(S_t)_{t \ge 0}$ of maps of the data space into itself, defined in
Proposition 8, will be called the “flow” on $\mathbf{R}^{sd}$ which “solves Eqs. (2.6.1) and
(2.6.2)”.


Observations. 


(1) Because of Eq. (2.6.11), the flow $(S_t)_{t \ge 0}$ is, in mathematical language, a
“semigroup”. When Eq. (2.6.1) is also normal in the past, it becomes possible
to define $S_t$ for $t < 0$, and the family $(S_t)_{t \in \mathbf{R}}$ forms a group, i.e., it verifies
Eq. (2.6.11) for all $t, t' \in \mathbf{R}$ (exercise).

(2) All the normal reversible equations are also normal in the past (exercise); 
hence, such a class of equations provides an important instance when the 
solution flow is a group. 


An interesting remark about autonomous equations, already met in Prob- 
lem 3, §2.2, is the following proposition. 


10 Proposition. Consider a normal $s$-th order autonomous differential
equation on $\mathbf{R}^d$, like Eq. (2.6.1). Suppose that $(\xi^{(0)}, \xi^{(1)}, \dots, \xi^{(s-1)})$ is an initial
datum such that there is some $T > 0$ for which

$$S_T(\xi^{(0)}, \xi^{(1)}, \dots, \xi^{(s-1)}) = (\xi^{(0)}, \xi^{(1)}, \dots, \xi^{(s-1)}); \tag{2.6.12}$$

then the motion generated by $(\xi^{(0)}, \xi^{(1)}, \dots, \xi^{(s-1)})$ is a “periodic motion” with
period $T$, i.e., it is a periodic solution of Eq. (2.6.1) with period $T$.


PROOF. The function $t \to S_{t + T}(\xi^{(0)}, \dots, \xi^{(s-1)})$, $t \ge 0$, where $(S_t)_{t \ge 0}$
is the solution flow to Eq. (2.6.1), is again a solution to Eq. (2.6.1) and, for
$t = 0$, verifies the initial condition $(\xi^{(0)}, \xi^{(1)}, \dots, \xi^{(s-1)})$ by our assumption
Eq. (2.6.12). Hence, by uniqueness, it coincides with $t \to S_t(\xi^{(0)}, \dots, \xi^{(s-1)})$.
This means that $t \to S_t(\xi^{(0)}, \dots, \xi^{(s-1)})$ is periodic with period $T$. mbe

Observation. More generally, it is clear that $t \to S_{t + T}(\xi^{(0)}, \dots, \xi^{(s-1)})$ is a
solution to Eq. (2.6.9) for $t \ge 0$: i.e., if $t \to x(t)$ is a solution to an autonomous
equation, $t \to x(t + T)$ is also a solution for $T \in \mathbf{R}$.
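
Proposition 10 can be illustrated with the harmonic oscillator, whose flow on the data space $\mathbf{R}^2$ is explicit. This is only a sketch: the equation $\ddot x = -\omega^2 x$, the value $\omega = 3$, the initial datum, and the function name `flow` are assumptions chosen for illustration.

```python
import math

def flow(t, xi, eta, omega=1.0):
    """Exact flow S_t(xi, eta) of x'' = -omega^2 x on the data space R^2."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (xi * c + eta * s / omega, -xi * omega * s + eta * c)

omega = 3.0
T = 2.0 * math.pi / omega           # candidate period
datum = (0.7, -1.2)                 # hypothetical initial datum (x, x')

after_T = flow(T, *datum, omega=omega)            # S_T(datum), as in Eq. (2.6.12)
times = (0.1, 1.0, 5.0)
shifted = [flow(t + T, *datum, omega=omega) for t in times]
plain = [flow(t, *datum, omega=omega) for t in times]
```

Since `after_T` agrees with `datum` up to rounding, the hypothesis of Eq. (2.6.12) holds, and the lists `shifted` and `plain` agree as the proposition predicts.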


2.6.1 Exercises and Problems 


Show that the following equations are normal both in the past and in the future and: 


1. Draw the trajectories of the flow $(S_t)_{t \in \mathbf{R}}$ in the data space $\mathbf{R}^2$ for the equation
$\ddot x = -g$, $g \in \mathbf{R}$.


2. Same as Problem 1 for $\ddot x = -\omega^2 x$, $\omega^2 > 0$.

3. Same as Problem 1 for $\ddot x = -g - \lambda x$, $g, \lambda \in \mathbf{R}$.

4. Same as Problem 1 for $\ddot x = \omega^2 x$, $\omega^2 > 0$.


5. Describe the trajectories of the flow $(S_t)_{t \in \mathbf{R}}$ in the data space $\mathbf{R}^6$ for the equations in
$\mathbf{R}^3$: $\ddot{\mathbf{x}} = -\mathbf{g}$ or $\ddot{\mathbf{x}} = -\omega^2 \mathbf{x}$.


6. Same as Problem 5 for the equation, in $\mathbf{R}^2$, $\dot x = ax + y$, $\dot y = -x + ay$, discussing the
result in terms of $a \in \mathbf{R}$.


7. If Eq. (2.6.1) is normal and reversible, prove that it is also normal in the past. Show that
the flow $(S_t)_{t \ge 0}$, solving it for $t \ge 0$, can be extended to a flow $(S_t)_{t \in \mathbf{R}}$, solving Eq. (2.6.1)
for all $t \in \mathbf{R}$: one can define $S_{-t} = S_t^{-1}$, $\forall t \ge 0$. In this case, the family $(S_t)_{t \in \mathbf{R}}$ forms a
group of maps of $\mathbf{R}^{sd}$ onto itself.


2.7 One-Dimensional Conservative Periodic and 
Aperiodic Motions 


Having completed a general survey on existence, uniqueness, and regularity 
properties of ordinary differential equations, let us go back to the qualitative 
theory of motions developing under the action of a purely positional force f 
considered in §2.1 [see Eq. (2.1.1)]. For such motions the energy conservation 
theorem was derived (so that they are called “conservative motions” generated 
by “conservative forces”). The analysis will now concern another qualitative 
property and study under which circumstances the motions are periodic or 
aperiodic. 

Let $V$ be the potential energy generating the force $f$ (i.e., $f(\xi) = -\frac{dV}{d\xi}(\xi)$)
and, in order to have motions defined for all times (“globally defined
motions”), suppose that $V$ is bounded below (see Proposition 6). Let $(\eta_0, \xi_0) \in \mathbf{R}^2$,
$t_0 \in \mathbf{R}$, and let $t \to x(t)$, $t \ge t_0$, be the solution to Eq. (2.1.1) with data:

$$\dot x(t_0) = \eta_0, \qquad x(t_0) = \xi_0. \tag{2.7.1}$$

If $E = \tfrac{1}{2} m \eta_0^2 + V(\xi_0)$, we can represent graphically the initial datum and the
potential as in Fig. 2.1.

If $\xi_0$, as in the picture, is between two contiguous and distinct solutions
$x_-(E) < x_+(E)$ of $V(\xi) = E$ and if $\frac{dV}{d\xi}(x_-(E)) < 0$, $\frac{dV}{d\xi}(x_+(E)) > 0$, then
by energy conservation the motion $t \to x(t)$ will never leave the interval
$[x_-(E), x_+(E)]$. In fact $V$ would be strictly larger than $E$ to the left of $x_-(E)$
and to the right of $x_+(E)$ and, therefore, the motion with energy $E$ would
have to have negative kinetic energy when occupying such a position.

Such a trapped motion will be periodic if and only if it takes a finite time
for it to run from $x_-(E)$ to $x_+(E)$. This amount of time can be estimated
easily by the quadrature formula (2.1.8), p. 12.

If $x_-(E)$ or $x_+(E)$ or both do not exist, the above argument says that
the motion may be unbounded. The above argument also does not give any
precise predictions when the derivative of $V$ vanishes in at least one of the
two points $x_-(E)$, $x_+(E)$.

The following proposition provides a general result and in its proof all the
above problems are implicitly or explicitly solved.


Figure 2.1: Two contiguous roots of $V(x) = E$.


11 Proposition. The motion $t \to x(t)$, $t \in [t_0, +\infty)$, of $m\ddot x = -\frac{dV}{dx}(x)$ with
initial datum (2.7.1) is periodic with a positive minimal period if and only if
$\xi_0$ is between two adjacent roots $x_- < x_+$ of $V(\xi) = E$, where the derivative
of $V$ is, respectively, negative and positive.


PROOF. Suppose that $V(x_\pm) = E$, $-V'(x_-) > 0$ and $V'(x_+) > 0$, and $V(\xi) < E$
for $\xi \in (x_-, x_+)$, and let $\xi_0 \in [x_-, x_+]$. As already noticed it must be that, $\forall t \ge t_0$,
$x(t) \in [x_-, x_+]$. Suppose, first, that $\eta_0 > 0$ and define $t_+$ = {supremum
of the values $t \ge t_0$ such that $\dot x(\tau) > 0$ for all $\tau \in [t_0, t)$}. From energy
conservation, one deduces:

$$\dot x(t) = +\sqrt{\tfrac{2}{m}\bigl(E - V(x(t))\bigr)}, \qquad t_0 \le t < t_+, \tag{2.7.2}$$

where the sign in front of the square root comes from the continuity of $\dot x(t)$
and from $\eta_0 > 0$. To estimate $t_+$, remark that Eq. (2.7.2) implies:

$$t - t_0 = \int_{\xi_0}^{x(t)} \frac{d\xi}{\sqrt{\tfrac{2}{m}(E - V(\xi))}}, \qquad t_0 \le t < t_+. \tag{2.7.3}$$

If we show that $\lim_{t \to t_+} x(t) = x_+$, it will follow from Eq. (2.7.3) that

$$t_+ - t_0 = \int_{\xi_0}^{x_+} \frac{d\xi}{\sqrt{\tfrac{2}{m}(E - V(\xi))}} \tag{2.7.4}$$

which can be used to estimate $t_+$ and to conclude that $t_+ < +\infty$: i.e., the
point reaches $x_+$ in a finite time.

Once the point reaches $x_+$, it cannot stay there since $f(x_+) = -\frac{dV}{d\xi}(x_+) < 0$
and, therefore, $\ddot x(t_+) < 0$. This means that $\dot x(t) < 0$, $x(t) < x_+$ in a
right-hand neighborhood of $t_+$, by Lagrange's theorem. We can then repeat the
already used argument to deduce that:

$$\dot x(t) = -\sqrt{\tfrac{2}{m}\bigl(E - V(x(t))\bigr)}, \qquad \forall t \in [t_+, t_-), \tag{2.7.5}$$

where $t_-$ is analogously defined as $t_-$ = {supremum of the values $t \ge t_+$ such
that $\dot x(\tau) < 0$ for all $\tau \in (t_+, t)$}. Proceeding as before, we shall show that

$$t_- - t_+ = \int_{x_-}^{x_+} \frac{d\xi}{\sqrt{\tfrac{2}{m}(E - V(\xi))}}. \tag{2.7.6}$$

The same arguments can be again repeated and, therefore, after a suitable
time $\bar t - t_-$, the point will again go through $\xi_0$ with positive velocity and

$$\bar t - t_- = \int_{x_-}^{\xi_0} \frac{d\xi}{\sqrt{\tfrac{2}{m}(E - V(\xi))}}. \tag{2.7.7}$$

By Proposition 10, p. 35, from now on the motion will identically repeat itself:
i.e., $x(t + T) = x(t)$, $\forall t \ge t_0$, if $T$ is the sum of the time intervals of Eqs.
(2.7.4), (2.7.6), and (2.7.7):

$$T = 2 \int_{x_-}^{x_+} \frac{d\xi}{\sqrt{\tfrac{2}{m}(E - V(\xi))}}; \tag{2.7.8}$$

hence, the motion will be periodic and $T$ will be, by construction, its minimal
period. It remains to show that $\lim_{t \to t_+} x(t) = x_+$ and that $t_+ < +\infty$.

Since $\dot x(t) > 0$, $\forall t \in [t_0, t_+)$, the limit $\lim_{t \to t_+} x(t) = \bar x$ exists and
it is approached monotonically. Then, if $\bar x < x_+$, it would follow that
$\lim_{t \to t_+} \dot x(t) = \bigl(\tfrac{2}{m}(E - V(\bar x))\bigr)^{1/2} > 0$; hence, $\dot x(t)$ would be $> 0$ in a
right-hand neighborhood of $t_+$ if $t_+ < +\infty$, against the very definition of $t_+$ or,
if $t_+ = +\infty$, this would mean that $\bar x = +\infty$ against $\bar x < x_+$. Hence, $\bar x = x_+$
and Eq. (2.7.4) holds.

To show that Eq. (2.7.4) also implies $t_+ < +\infty$, apply Lagrange's theorem
to infer that there is a point $\tilde x \in (\xi_0, x_+)$ such that for all $\xi \in (\tilde x, x_+)$:

$$E - V(\xi) \ge \frac{f(x_+)}{2}(\xi - x_+) = \frac{|f(x_+)|}{2}(x_+ - \xi) \tag{2.7.9}$$

because $E = V(x_+)$ and $f(x_+) = -\frac{dV}{d\xi}(x_+) < 0$ and $(E - V(\xi)) - (E - V(x_+)) - f(x_+)(\xi - x_+)$
is infinitesimal of higher order in $(\xi - x_+)$ as $\xi \to x_+$.
Therefore,

$$t_+ - t_0 \le \int_{\xi_0}^{\tilde x} \frac{d\xi}{\sqrt{\tfrac{2}{m}(E - V(\xi))}}
+ \int_{\tilde x}^{x_+} \frac{d\xi}{\sqrt{\tfrac{1}{m} f(x_+)(\xi - x_+)}} < +\infty \tag{2.7.10}$$

since the first integral is finite because $\max\, (2(E - V(\xi))/m)^{-1} < +\infty$ in
$[\xi_0, \tilde x]$, while the second integral is also finite (an explicit computation). The
alternatives initially set aside, namely $\eta_0 < 0$ or $\eta_0 = 0$ (i.e., $\xi_0 = x_\pm$) are
reduced to the one just treated.

Finally, the cases $f(x_+) = 0$, or $f(x_-) = 0$, or $f(x_+) = f(x_-) = 0$, or $x_+$
or $x_-$ not existing have to be discussed and shown to give rise to motions not
periodic or with period 0. This last case is realized if (and only if) $\eta_0 = 0$ and
$f(\xi_0) = 0$: one says that $\xi_0$ is an “equilibrium point”. Among the remaining
cases consider, as an example, the case $\eta_0 > 0$, $f(x_+) = -(dV/d\xi)(x_+) = 0$.
Proceeding as before, it is found that $t_+$ is still given by Eq. (2.7.4). This
time, however, to estimate $t_+$ Eq. (2.7.9) must be improved by using Taylor's
formula to second order, since $f(x_+) = 0$. If $f'(x_+) = (df/d\xi)(x_+)$, it is:

$$E - V(\xi) = \tfrac{1}{2} f'(x_+)(\xi - x_+)^2 + o((\xi - x_+)^2) \tag{2.7.11}$$

because the left-hand side vanishes together with its first derivative in $x_+$.
Hence $\exists\, \tilde x \in [\xi_0, x_+)$ such that, if $f'(x_+) > 0$,

$$E - V(\xi) \le f'(x_+)(\xi - x_+)^2, \qquad \forall \xi \in (\tilde x, x_+). \tag{2.7.12}$$

Thus, if $f'(x_+) > 0$, we deduce from Eqs. (2.7.4) and (2.7.12):

$$t_+ - t_0 \ge \int_{\tilde x}^{x_+} \Bigl(\tfrac{2}{m} f'(x_+)(\xi - x_+)^2\Bigr)^{-1/2} d\xi = +\infty. \tag{2.7.13}$$

The case $f'(x_+) = 0$ is treated likewise, as, in this case, $E - V(\xi)$ is
infinitesimal of higher than second order in $\xi - x_+$ and an inequality like Eq. (2.7.12)
holds, therefore, with $f'(x_+)$ replaced, say, by 1.

The case $f'(x_+) < 0$ is impossible if $\eta_0 > 0$ (since this would mean that
$x_+$ is a minimum for $V$). mbe
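
The quadrature formula of Eq. (2.7.8) lends itself to direct numerical evaluation once the inverse-square-root behavior at the endpoints is dealt with. The following is a sketch under assumptions not found in the text: the substitution $\xi = c + r\sin\theta$, with $c$ and $r$ the midpoint and half-width of $[x_-, x_+]$, combined with the midpoint rule, and the function names `period` and `pendulum_period`, are choices made here for illustration.

```python
import math

def period(V, E, x_minus, x_plus, m=1.0, n=20000):
    """Approximate T = 2 * integral_{x-}^{x+} dxi / sqrt((2/m)(E - V(xi))), Eq. (2.7.8).

    The substitution xi = c + r*sin(theta) removes the inverse-square-root
    endpoint singularities; the midpoint rule never evaluates the endpoints.
    """
    c, r = 0.5 * (x_plus + x_minus), 0.5 * (x_plus - x_minus)
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * h
        xi = c + r * math.sin(theta)
        total += r * math.cos(theta) / math.sqrt(2.0 * (E - V(xi)) / m)
    return 2.0 * total * h

# Hypothetical check 1: harmonic potential, whose period is 2*pi for every E > 0.
T_harm = period(lambda x: 0.5 * x * x, 0.5, -1.0, 1.0)

# Hypothetical check 2: pendulum potential V(x) = 1 - cos(x), for 0 < E < 2.
def pendulum_period(E):
    a = math.acos(1.0 - E)          # the turning points are -a and +a
    return period(lambda x: 1.0 - math.cos(x), E, -a, a)

T_small, T_large = pendulum_period(1e-4), pendulum_period(1.99)
```

In this example `T_small` is close to $2\pi$, while `T_large` is much longer: as $E$ approaches 2 the turning points approach a stationary point of $V$, the situation in which Eq. (2.7.13) gives an infinite time.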


For future reference let us state the following obvious proposition. 


12 Proposition. If $\xi_0 \in \mathbf{R}$, the constant function $t \to x(t) = \xi_0$ solves Eq.
(2.1.1) if and only if $\xi_0$ is a stationary point for the potential energy $V$.


2.7.1 Exercises and Problems 


1. Estimate the period of the motions indicated below with an error rigorously bounded by 
60%: 


x =a(x — 1), x(0) = 0, “(0) = a or 5, 
i=- 2g — 4r, 2(0)=1/V2,  (0)=0 
pesg x(0) = 0, &(0) = 1, 


Peles: sO) =: #0) = 5) 


Plog tee: aay 5, (0) = 0.
wa, 2 — t+ dx ine 1 T4 dx = T
(Hint: Show first: Ja (x4 a ene Ot < To Q(E) i (a2,—2)(a@—-2_) Q(E)’ 
max /@ 


A Rarer : I-I 
where € is any point in [xz—, x+], with an error 6 = | 7 ol < [p28 1|.) 
2. Find, if they exist, values of $E$ to which correspond aperiodic motions for the equations
in Problem 1, and for: $\ddot x = -x e^{-x^2}$; $\ddot x = -\sin x$.


3. Same as Problem 1 with an error rigorously bounded by 10% or 1%, using a desk 
computer. 


4. Find whether the motions associated with the second equation in Problem 1 admit a 
motion with period T = 10 and, if it exists, estimate within 20% the amplitude of such a 
motion. 


5. Show that the motion of total energy E verifying ẍ = −x³ has a period
T(E) proportional to E^{−1/4} if the potential energy is defined as V(ξ) = ξ⁴/4. Show that
the proportionality constant is 2 \int_{-1}^{1} (1 - \xi^4)^{-1/2} d\xi. (Hint: Write the formula of quadrature
for T, Eq. (2.7.8), and rescale the integration variable by a factor proportional to E^{1/4}.)


6. Show that the period of the motion with energy E verifying mẍ = −(dV/dx)(x),
with V such that V(0) = 0, V'(0) = 0, V''(0) > 0, is such that
\lim_{E \to 0^+} T(E) = 2\pi (m / V''(0))^{1/2}. (Hint: see the hint to Problem 1.)


7. Let ξ → V(ξ) be a C^∞ convex even function vanishing at the origin. Let


1 e 1 
=o E? ay 5 


Vio =5 


(sup VEN) E 


Consider a motion, associated with the potential energy V, having total energy E. Show
that its period is larger than the period of the motions with potential energy V̄.


8. Suppose that V(ξ) = |ξ|^α, α > 1, and show that the period of the motion with energy
E is proportional to E^{1/α − 1/2} (see Problem 5).


9. Find the limit as E → +∞ of the period of the motion with energy E developing with
i — 1¢2 1¢4 
potential energy V(§) = mS + HE 


10. Same as Problem 9 with V such that V(ξ) = V(−ξ), \lim_{\xi \to \infty} V(\xi)/\xi^2 = +\infty.


11. Same as Problem 9 with V such that V(ξ) = V(−ξ), \lim_{\xi \to \infty} V(\xi)/\xi^2 = 0.


2.8 Equilibrium: Stability in the Absence of Friction 


In the proof of Proposition 12, p. 39, it has been remarked that stationary
solutions of mẍ = f(x), i.e., solutions like t → x_0 = constant, correspond to
the stationary points of the potential energy function V. In such positions,
called “equilibrium positions”, the exerted force vanishes. It is also possible
to further distinguish the equilibrium points on the basis of a qualitative
property: the stability of their equilibria. Stability is an empirical notion
susceptible to assuming different precise meanings, depending on the particular
problem where it appears necessary to study the stability of an equilibrium point.

It is therefore useful to provide several different definitions of stability for 
an equilibrium point, leaving to the imagination of the reader the identification 
of different types of problems for which such types of notions might be relevant. 
A deeper analysis of the stability notion will be found in Chapter 5, which is 
entirely devoted to stability theory. 

In the following, x_0 shall denote an equilibrium point for mẍ = f(x) under
the assumption that f is generated by a C^∞ potential V bounded from below
(so that the equation of motion is normal, see Proposition 6, p. 29).


8 Definition. x_0 is a stable equilibrium position if there is a function
ε → a(ε) ≤ +∞ defined for ε > 0 and infinitesimal as ε → 0, such that every
motion following an initial condition x(0) = x_0, |ẋ(0)| < ε has the property:


|x(t) - x_0| < a(\varepsilon),   \forall t \ge 0    (2.8.1)


Observations. 


(1) In other words, x_0 is a stable equilibrium position if a point mass placed
at x_0 with small velocity stays indefinitely close to x_0 and the smaller ẋ(0),
the closer it will stay.

(2) The fact that a(ε) might be +∞ means that we admit the possibility
that initial data whose velocity ẋ(0) is too large may originate motions which
travel indefinitely far from x_0. Equation (2.8.1) is really a condition which is
relevant only for ε small.

(3) The choice of t_0 = 0 as initial time is irrelevant since the equation of
motion is autonomous.


In most applications it is by no means sufficient to know that x_0 is a stable
equilibrium position in the sense of Definition 8. For instance, it is sometimes
necessary that the function a(ε), which could be called the “tolerance” function,
has a preassigned structure. This leads to the following definition:


9 Definition. Given a function of the variable ε > 0, ε → b(ε) ≤ +∞ (not
necessarily infinitesimal as ε → 0), one says that x_0 is a stable equilibrium
position “with tolerance b” if the motion t → x(t), t ≥ 0, following an initial
condition x(0) = x_0, |ẋ(0)| < ε is such that


|x(t) - x_0| < b(\varepsilon),   \forall t \ge 0.    (2.8.2)


Observations. 


(1) Definition 9 differs from Definition 8 because ε → b(ε) is a priori given
and also because b(ε) is not necessarily infinitesimal as ε → 0.

(2) Obviously one can also give other analogous definitions where the
“perturbed” initial data look like x(0) = x_0 + ε, ẋ(0) = 0, or some other.


Avoiding formalization of the possibilities hidden in observation 2, some 
stability criteria will be discussed. A well-known simple criterion for stability 
in the sense of Definition 8 is stated in the following proposition and it will 
suggest studying a third stability definition involving the introduction of novel 
interesting ideas, see §2.9. 


13 Proposition. If xo is a strict minimum for the potential energy function 
V, then xo is a stable equilibrium point in the sense of the Definition 8. 


PROOF. Let E_ε = ½mε² + V(x_0) be the total energy of the initial datum
x(0) = x_0, ẋ(0) = ε. By assumption, x_0 is a point of strict minimum for V,
see Fig. 2.2, i.e., V(ξ) > V(x_0) if ξ ≠ x_0 and |ξ − x_0| is small enough; hence
it is possible to define the positions x_{−,ε} and x_{+,ε}, which are the first roots of
E_ε − V(ξ) = 0 to the left or to the right of x_0, respectively. It is also easy to
check that the strict minimum assumption also implies that


\lim_{\varepsilon \to 0} x_{\pm,\varepsilon} = x_0    (2.8.3)


and also that x_{+,ε} and x_{−,ε} are, respectively, monotonically increasing and
decreasing in ε. For large ε, it might happen that E_ε − V(ξ) does not have
one of the two roots x_{−,ε} or x_{+,ε}, or both. In this case define x_{−,ε} = −∞ or
x_{+,ε} = +∞.


Fig. 2.2: A minimum of the potential at x_0 and the two points x_{−,ε}, x_{+,ε}.


Then if one sets


a(\varepsilon) = \max_{\sigma = \pm} |x_{\sigma,\varepsilon} - x_0|,    (2.8.4)

Eq. (2.8.1) is verified for the motions t → x(t) such that x(0) = x_0, |ẋ(0)| < ε,
by the arguments of Proposition 11, p. 37. mbe


Observations. 


(1) The proof method of Proposition 13 allowed us to define, in fact, the 
minimal tolerance function; i.e., Eq. (2.8.4). It is therefore easy to provide 
stability criteria in the sense of Definition 9 by using the preceding proof, 
under the assumptions of Proposition 13. 

(2) Note that if V''(x_0) > 0 the function a(ε) in Eq. (2.8.4) is of O(ε).
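
The construction used in the proof of Proposition 13 can be tried numerically. The following Python sketch is only an illustration: the function names, the scanning step and the sample potential V(ξ) = ξ²/2 are assumptions made here and do not come from the text. It looks for the first roots of E_ε − V(ξ) = 0 on each side of x_0 and evaluates a(ε) as the larger of the two distances.

```python
import math

def turning_point(V, x0, E, direction, step=1e-3, x_max=1e3):
    """First root of E - V(x) = 0 on one side of x0 (direction = +1 or -1).

    Scans outward from x0 in steps of size `step`; returns +/- infinity
    if no root is met within a distance x_max of x0.
    """
    a = x0
    while abs(a - x0) < x_max:
        b = a + direction * step
        if V(b) >= E:
            # The root is bracketed between a and b: refine it by bisection.
            for _ in range(60):
                mid = 0.5 * (a + b)
                if V(mid) >= E:
                    b = mid
                else:
                    a = mid
            return 0.5 * (a + b)
        a = b
    return direction * math.inf

def tolerance(V, x0, eps, m=1.0):
    """Largest distance from x0 of the two turning points at energy E_eps."""
    E = 0.5 * m * eps ** 2 + V(x0)
    return max(abs(turning_point(V, x0, E, s) - x0) for s in (+1, -1))

# Hypothetical example: harmonic potential with minimum at x0 = 0.
V = lambda x: 0.5 * x ** 2
for eps in (0.4, 0.2, 0.1):
    print(eps, tolerance(V, 0.0, eps))
```

For this sample potential the printed values shrink in proportion to ε, in line with observation (2).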
2.8.1 Exercises and Problems 


1. Determine the stable equilibrium positions in the sense of Definition 9 with tolerance 
functions: 


b(e) = 3e fore < + 
ble) =+00 fore > H, 


We)=5+e or { 


for a unit mass point acted upon by a force with potential energy 


VE) =EE-1), or log(1+€2), or—sing, or Zeer 


2. Show that not all the stable equilibrium positions for V (£) = (sin €2)e-& have tolerance 
b(e) = 4 ife < 1 and b(e) = +œ if e > 1. 


3.* Show that the potential energy V defined by 


2 sa vd: 
VE) = eSI (E + (sinz)?), E40 


and V(0) = 0 has infinitely many stable equilibrium positions in the sense of Definition 8. 


(Hint: Show that V’(£) is infinitely many times positive and negative near zero.) 


2.9 Stability and Friction 


A further alternative definition of stability of an equilibrium point x_0 for a
force law f ∈ C^∞(R), with potential energy V bounded from below, comes from the
remark that, in practice, when x_0 is a stable equilibrium position, then, under
a small perturbation of the equilibrium state, the point mass moves away
from x_0 to return eventually to x_0 with essentially zero velocity, as really
happens when a pendulum is slightly deflected from its equilibrium position.

To give a mathematically precise meaning to the stability criterion that 
seems to emerge from these considerations, it is necessary to formulate a 
precise definition of the term “friction”. 

An accurate analysis of the friction phenomenon can be found in physics
and engineering textbooks: here it will be enough to remark that, empirically,
a friction force acts “against the motion”; then one understands why a
mathematical model for a friction force is that of a force law depending on the
position x and, mainly, on the velocity ẋ of the point mass in such a way as to
have a sign systematically opposite to that of ẋ.

The simplest model describes the friction force A in terms of a nonnegative
C^∞ function (η, ξ) → a(η, ξ) on R² as:


A(\dot{x}, x) = -\dot{x} a(\dot{x}, x)    (2.9.1)


with a verifying the further property that a(η, ξ) ≠ 0 for η ≠ 0; i.e., friction
is absent only if the point is standing still. There are, however, phenomena
for which this is not a good model, like the so-called “static friction” cases
(which are modeled by discontinuous friction forces). Remarkable examples
of Eq. (2.9.1) are: “linear friction”,


A(\dot{x}, x) = -\lambda \dot{x},   \lambda > 0;    (2.9.2)

“cubic” friction,

A(\dot{x}, x) = -\lambda \dot{x} (1 + \lambda' \dot{x}^2),   \lambda, \lambda' > 0    (2.9.3)

and “quadratic friction”,

A(\dot{x}, x) = -\lambda \dot{x} (1 + \lambda' \dot{x}^2)^{1/2},   \lambda, \lambda' > 0    (2.9.4)


The following stability notion can then be formulated.


10 Definition. If x_0 is an equilibrium point for mẍ = f(x), it will be said
“strongly stable” if for small enough ε the motions t → x(t), t ≥ 0, with initial
data x(0) = x_0, ẋ(0) = ε and described by the (normal) equation


m \ddot{x} = -\lambda \dot{x} + f(x)    (2.9.5)


are such that


\lim_{t \to +\infty} x(t) = x_0,   \forall \lambda > 0    (2.9.6)


Observation. In other words, this means that x_0 is strongly stable if, in the
presence of an arbitrarily small friction, an initial datum x(0) = x_0, ẋ(0) = ε
produces a motion returning asymptotically to x_0, at least if ε is not too large.
The following is a stability criterion in the new sense.


14 Proposition. Let x_0 be an equilibrium point for mẍ = −(dV/dx)(x) with
V ∈ C^∞(R) bounded from below. Suppose that for ξ − x_0 ≠ 0 and small
enough, the derivative −f'(ξ) = (d²V/dξ²)(ξ) is positive (“strict convexity of V at
x_0”); then x_0 is a strongly stable equilibrium point.
Observations. 


(1) The condition on V is verified if, for instance, V has a strict minimum in
x_0 and not all its derivatives vanish in x_0.
(2) The function V defined to be 0 for ξ = 0 and, for ξ ≠ 0:


V(E) = eM (e? + (sin 5?) (2.9.7) 


is a potential energy function to which the criterion of Proposition 14 cannot
be applied. One can see that, actually, Eq. (2.9.7) provides a counterexample
to the thought, which might arise, that the above strong stability notion is
equivalent to the one of Definition 8. The origin is, in fact, a stable
equilibrium position because of Proposition 13, p. 42, but it is not a strongly
stable equilibrium point.

(3) The proof of Proposition 14 is a particular case of a quite general technique
adaptable to the analysis of various stability problems, as will be seen again
in Chapter 5.


PROOF. Intuitively it can be expected that, in presence of friction, energy is
no longer conserved: it will, indeed, be shown that the energy of the motion
t → x(t), solution to Eq. (2.9.5), defined as E(t) = ½mẋ(t)² + V(x(t)), t ≥ 0,
is a non constant function of t, such that lim_{t→+∞} E(t) = E_0 = V(x_0). Since
V(ξ) ≥ V(x_0) = E_0 and in the vicinity of x_0 there is just one point, namely
x_0, where V(ξ) = E_0, it must follow that lim_{t→+∞} x(t) = x_0, if ε is small.
To study the energy variation, with time, of a motion verifying Eq. (2.9.5),
compute its derivative:


\frac{dE}{dt} = \frac{d}{dt} ( \frac{m \dot{x}(t)^2}{2} + V(x(t)) ) = \dot{x} (m \ddot{x} - f(x)) = -\lambda \dot{x}^2 \le 0    (2.9.8)


which shows that, in presence of linear friction, energy is nonincreasing (and
strictly decreasing when the velocity does not vanish). Therefore, the limit

E_\infty = \lim_{t \to \infty} E(t) \ge \inf_{\xi \in R} V(\xi) > -\infty  exists.    (2.9.9)

Since x_0 is, by the assumption on f', a strict minimum point, there are (if ε
is small enough) two points x_{+,ε} and x_{−,ε}, to the right of x_0 and to the left of
x_0, respectively, that cannot be bypassed by the motion with Eq. (2.9.5) and
initial datum x(0) = x_0, ẋ(0) = ε, because E(t) ≤ E(0), ∀t ≥ 0. Figure 2.3
eloquently illustrates this, making it unnecessary to expound further details.


Fig. 2.3: Decrease in energy as function of time in presence of friction
(abscissae marked: x_{−,ε}, x̄_1, x_0, x̄_2, x_{+,ε}).

Suppose that ε has been chosen so small that, as in Fig. 2.3, f(ξ) ≠ 0 if
ξ ≠ x_0, x_{−,ε} ≤ ξ ≤ x_{+,ε}: this is possible by the supposed structure of the
minimum of V in x_0. Per absurdum, let E_∞ > E_0, as in Fig. 2.3. Remark first
that lim_{t→+∞} x(t) = x̄ must exist. Otherwise, if x̄_1 = liminf_{t→+∞} x(t) <
x̄_2 = limsup_{t→+∞} x(t), there would be an interval [a, b] ⊂ (x̄_1, x̄_2), where
min_{ξ∈[a,b]} (E_∞ − V(ξ)) > 0, see Fig. 2.3. Such an interval would have to be run
infinitely many times as t → +∞, since x̄_1 and x̄_2 are limit points for x(t);
furthermore, when the point mass is in [a, b], its velocity is neither too small
nor too large:


|\dot{x}(t)| = ( \frac{2}{m} (E(t) - V(x(t))) )^{1/2} \ge ( \frac{2}{m} \min_{\xi \in [a,b]} (E_\infty - V(\xi)) )^{1/2} > 0,    (2.9.10)

|\dot{x}(t)| \le ( \frac{2}{m} (E(0) - E_0) )^{1/2}.    (2.9.11)
m 

Therefore, every time the point mass enters [a, b], it spends therein at least a
time τ:


\tau = (b - a) \sqrt{ \frac{m}{2 (E(0) - E_0)} } > 0    (2.9.12)

by Eq. (2.9.11) and, therefore [see Eqs. (2.9.8) and (2.9.10)], it loses an amount
of energy given, at least, by


\lambda \tau \frac{2}{m} \min_{\xi \in [a,b]} (E_\infty - V(\xi)) > 0.    (2.9.13)


Hence, after infinitely many passages through [a, b], the energy should become
E_∞ = −∞, but E_∞ > E_0. Thus the limit x̄ = lim_{t→+∞} x(t) exists and x̄ must
be one of the two abscissae of the intersections of E_∞ with the graph of V,
i.e., in Fig. 2.3, one of the two points x̄_1 or x̄_2. Otherwise, lim_{t→+∞} ẋ(t) =
±((2/m)(E_∞ − V(x̄)))^{1/2} ≠ 0 and x(t) could not have a finite limit (see
footnote 5). This, in turn, implies that lim_{t→+∞} ẋ(t) = 0.

The last property is, however, in contradiction with the equations of
motion (2.9.5) which would imply that


\lim_{t \to +\infty} \ddot{x}(t) = \frac{f(\bar{x})}{m} \ne 0,    (2.9.14)


i.e., that the limit as t → +∞ of ẋ(t) could not be finite while we proved it to
be zero. Hence, E_∞ cannot be larger than E_0, and, then, as already remarked
at the beginning of this proof, lim_{t→+∞} x(t) = x_0. mbe
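
The mechanism of the proof, energy decreasing monotonically until the motion settles at x_0, can be observed on a numerical example. The Python sketch below is a rough illustration under assumptions chosen here and not taken from the text: a sample potential V(ξ) = ξ²/2 + ξ⁴/4 (strictly convex, minimum at x_0 = 0), m = 1, λ = 0.3, and a standard fourth-order Runge-Kutta integrator with a fixed step.

```python
def simulate(V_prime, x0, eps, lam, m=1.0, dt=1e-3, t_end=60.0):
    """Integrate m x'' = -lam x' - V'(x), x(0) = x0, x'(0) = eps, with RK4.

    Returns a list of (t, x, v) samples, one per time step.
    """
    def acc(x, v):
        return (-lam * v - V_prime(x)) / m

    x, v = x0, eps
    history = [(0.0, x, v)]
    n_steps = int(round(t_end / dt))
    for i in range(n_steps):
        k1x, k1v = v, acc(x, v)
        k2x, k2v = v + 0.5 * dt * k1v, acc(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = v + 0.5 * dt * k2v, acc(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = v + dt * k3v, acc(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        history.append(((i + 1) * dt, x, v))
    return history

# Hypothetical example potential and its derivative (m = 1).
V = lambda x: 0.5 * x ** 2 + 0.25 * x ** 4
V_prime = lambda x: x + x ** 3

hist = simulate(V_prime, x0=0.0, eps=0.5, lam=0.3)
energies = [0.5 * v * v + V(x) for _, x, v in hist]
print("E(0) =", energies[0], " E(t_end) =", energies[-1], " x(t_end) =", hist[-1][1])
```

In this run the sampled energy never increases and the final position is very close to the minimum, as Eq. (2.9.8) and Proposition 14 lead one to expect.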


2.9.1 Exercises and Problems 


1. Show that the equation for the energy variation versus the position is, for the motions
verifying mẍ + λẋ + V'(x) = 0, given by:


\frac{dE}{dx}(x) = \mp \lambda \sqrt{ \frac{2}{m} (E(x) - V(x)) }

(upper sign while ẋ > 0, lower sign while ẋ < 0).


5 Exercise: If lim_{t→+∞} f(t) and lim_{t→+∞} f'(t) exist, then lim_{t→+∞} f'(t) = 0 (denoting
f' the derivative of f).
2.* Consider the motions described by the equations:


m \ddot{x} + \lambda \dot{x} + V'(x) = 0,    m \ddot{y} + \lambda \dot{y} + W'(y) = 0,


with ẋ(0) = ẏ(0) = 0, x(0) = y(0) = x_0 and suppose that for x_0 ≤ ξ ≤ x_1 one has
0 < −W'(ξ) < −V'(ξ).

Denote v_x(ξ) and v_y(ξ) the velocity of the motions x and y, respectively, at their passage
through ξ ∈ [x_0, x_1] and suppose also that it is known that ẋ(t), ẏ(t) are non-negative for
all the times preceding the (respective) time of first passage through x_1.

Show that v_x(ξ) > v_y(ξ), ∀ξ ∈ (x_0, x_1]. (Hint: Use the result of Problem 1 to deduce from
v_x(ξ) = (2(E(ξ) − V(ξ))/m)^{1/2}:

\frac{d}{d\xi} ( v_x(\xi)^2 - v_y(\xi)^2 ) = \frac{2}{m} ( -\lambda (v_x(\xi) - v_y(\xi)) - V'(\xi) + W'(\xi) ).

This proves that (d/dξ)(v_x(ξ)² − v_y(ξ)²) > 0 for ξ ≥ x_0 and close enough to x_0; hence, for
such ξ's, v_x(ξ) > v_y(ξ). If there existed ξ ∈ (x_0, x_1] where v_x(ξ) = v_y(ξ) we could consider
the smallest among them: still call it ξ. Then (d/dξ)(v_x(ξ)² − v_y(ξ)²) ≤ 0, since ξ is the
first point where v_x(ξ) = v_y(ξ); but this contradicts the above equation for v_x(ξ)² − v_y(ξ)²
since v_x(ξ) = v_y(ξ) while −V'(ξ) + W'(ξ) > 0.)

3.* Consider the case analogous to the one in Problem 2 with initial datum ẋ(0) = ẏ(0) =
v_0 > 0.

4.* Formulate and prove results analogous to Problems 2 and 3, when v_0 < 0, 0 < W'(ξ) <
V'(ξ).


5. Consider the equation @+A%— f (£) = 0, x(0) = 0, (0) = 1 or (0) = —1/2/15. Determine 
the limit, as t + +00 , of z(t) for A = 50 and for f with potential energy V(€) = €7(1+€)?. 


6. Same as Problem 5 for λ = 10 and x(0) = 0, ẋ(0) = 10.

7. Same as Problem 5 for V(€) = (£? — 1)(€ + 2),2(0) = 3 and «(0) = 0,A = 4 or 
V(E) = E? (E + 1)(€ + 2), A= 1, 2(0) = 0, ż¿(0) = — V2. 

8. How large should à be so that the motion verifying # = —« + V' (x), with V (E) 
lee”, (0) = 0, (0) = 10, is attracted by the origin? (Find a lower bound only.) 


9. Show that for λ small enough, the motion in Problem 8 “runs away”, i.e., lim_{t→+∞} x(t) =
+∞. For such a motion, after an arbitrary choice of λ, estimate the time necessary to reach
the point with abscissa ξ = 10. (Find an upper and a lower bound.)


2.10 Period and Amplitude: Harmonic Oscillators 


In this section a point with mass m is considered subject to a force law f
generated by a C^∞ potential energy V such that


(i)   V(\xi) = V(-\xi);
(ii)  \frac{dV}{d\xi}(\xi) \ne 0,   \xi \ne 0;    (2.10.1)
(iii) \lim_{\xi \to \infty} V(\xi) = +\infty.


In §2.7 it was proved that all motions are periodic with period T < +∞.
We now ask whether there exist potential energy functions V verifying
Eq. (2.10.1) and generating motions with energy-independent (or
amplitude-independent) period.

It is well known that the “elastic energy” V(ξ) = V(0) + ½kξ² generates
“isochronous” motions of period T = 2π(m/k)^{1/2}, constant as the total energy
varies (“harmonic oscillations”).

It is remarkable that, in the class (2.10.1), this isochrony is a characteristic
property of the harmonic oscillators ([28]).
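
Isochrony of the elastic energy, and its failure for another even potential, can be checked by evaluating the quadrature formula for the period numerically. The Python sketch below is only an illustration: it assumes an even potential, so that T(E) is four times the time needed to go from 0 to the amplitude x_+, and it uses the substitution ξ = x_+ sin θ (a choice made here, not in the text) to tame the integrable singularity at the turning point. The quartic example V(ξ) = ξ⁴/4 is likewise an assumption chosen for comparison.

```python
import math

def period(V, E, x_plus, m=1.0, n=20000):
    """T(E) = 4 * int_0^{x_+} d(xi) / sqrt(2 (E - V(xi)) / m) for an even V.

    x_plus must be the positive root of V(x) = E. The midpoint rule in the
    variable theta (xi = x_plus * sin(theta)) never evaluates the integrand
    exactly at the turning point.
    """
    h = 0.5 * math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        xi = x_plus * math.sin(theta)
        speed = math.sqrt(2.0 * (E - V(xi)) / m)
        total += x_plus * math.cos(theta) / speed
    return 4.0 * h * total

k = 4.0
harmonic = lambda x: 0.5 * k * x ** 2
quartic = lambda x: 0.25 * x ** 4

for E in (0.5, 2.0, 8.0):
    T_h = period(harmonic, E, math.sqrt(2.0 * E / k))
    T_q = period(quartic, E, (4.0 * E) ** 0.25)
    print(E, T_h, T_q)
```

With m = 1 and k = 4 the harmonic column stays at 2π(m/k)^{1/2} = π for every energy, while the quartic column decreases with E.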


15 Proposition. If all motions developing under the action of a force with
potential V verifying Eq. (2.10.1) have the same period, ∃k > 0 such that


V(\xi) = \frac{1}{2} k \xi^2 + V(0).    (2.10.2)


Fig. 2.4: A potential satisfying Eq. (2.10.1).


Observation. Using the idea involved in the following proof, it is also possible 
to treat the case when V does not verify (i). See the observations following 
Corollary 16 below. 


PROOF. Let E be the energy of the motion associated with the potential of Eq.
(2.10.1) and let x(E) be the corresponding amplitude (x(E) = x_+ with the
notations of §2.7, see Fig. 2.4). The period of this motion is [see Eq. (2.7.8)]


T(E) = 4 \int_0^{x(E)} \frac{d\xi}{\sqrt{ \frac{2}{m} (E - V(\xi)) }}.    (2.10.3)


Since V is monotonically increasing in ξ for ξ ≥ 0, the inverse function to the
function V can be defined. Denote it by v → ξ(v), defined for v ∈ [V(0), +∞)
and such that V(ξ(v)) = v, ∀v ∈ [V(0), +∞). The second relation in Eq.
(2.10.1) implies that v → ξ(v) is in C^∞((V(0), +∞)), say, by the implicit
function theorem (see Appendix G).

Changing coordinates in Eq. (2.10.3), ξ → ξ(v), we find:


T(E) = 4 \int_{V(0)}^{E} \frac{\xi'(v) dv}{[2 (E - v)/m]^{1/2}}    (2.10.4)
where ξ'(v) is the derivative of ξ(v) with respect to v. Note that ξ'(v) diverges
as v → V(0), but the divergence is summable in Eq. (2.10.4). Supposing
E → T(E) known for E ∈ (V(0), +∞), Eq. (2.10.4) becomes an equation for
ξ(v) which can be solved through the following artifice. Multiply Eq. (2.10.4)
by (b − E)^{−1/2} and integrate both sides between V(0) and b (assuming that the
arbitrary parameter b is chosen larger than V(0)):


\int_{V(0)}^{b} \frac{T(E) dE}{\sqrt{b - E}}
    = 4 \sqrt{\frac{m}{2}} \int_{V(0)}^{b} dE \int_{V(0)}^{E} \frac{\xi'(v) dv}{\sqrt{(E - v)(b - E)}}
    = 4 \sqrt{\frac{m}{2}} \int_{V(0)}^{b} dv \, \xi'(v) ( \int_{v}^{b} \frac{dE}{\sqrt{(E - v)(b - E)}} ).    (2.10.5)


The integral in the last parenthesis can be explicitly computed and its value
is π (∀ v, b!). Hence,


\int_{V(0)}^{b} \frac{T(E) dE}{\sqrt{b - E}} = 4\pi \sqrt{\frac{m}{2}} (\xi(b) - \xi(V(0))) = 4\pi \sqrt{\frac{m}{2}} \xi(b).    (2.10.6)


This formula is interesting in itself since it provides the expression of the
potential energy “as a function of the period” for all V’s verifying Eq. (2.10.1).
When T(E) = T = constant, ∀E ∈ (V(0), +∞), Eq. (2.10.6) yields


\xi(b) = \frac{T}{\pi \sqrt{2m}} \sqrt{b - V(0)}    (2.10.7)


which, remembering the definition of ξ(b), means that


V(\xi) = \frac{1}{2} m ( \frac{2\pi}{T} )^2 \xi^2 + V(0).    (2.10.8)

mbe
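
The inversion of the period function described in the proof lends itself to a numerical check. The sketch below is an illustration under assumptions made here: V(0) = 0, the relation ξ(b) = (2π(2m)^{1/2})^{−1} ∫_0^b T(E) (b − E)^{−1/2} dE obtained by solving the last integral identity of the proof for ξ(b), and the substitution E = b(1 − u²), which removes the singularity at E = b. The function name is hypothetical.

```python
import math

def amplitude_from_period(T, b, m=1.0, n=2000):
    """xi(b): amplitude at energy b of an even potential with V(0) = 0,
    reconstructed from the period function E -> T(E).

    With E = b (1 - u^2) one has
        int_0^b T(E) dE / sqrt(b - E) = 2 sqrt(b) int_0^1 T(b (1 - u^2)) du,
    which is evaluated here with the midpoint rule.
    """
    h = 1.0 / n
    integral = h * sum(T(b * (1.0 - ((j + 0.5) * h) ** 2)) for j in range(n))
    return 2.0 * math.sqrt(b) * integral / (2.0 * math.pi * math.sqrt(2.0 * m))

# Constant period T0 = 2*pi with m = 1 should give back V(xi) = xi^2 / 2,
# i.e. the amplitude at energy b should be sqrt(2 b).
T0 = 2.0 * math.pi
for b in (0.5, 2.0, 8.0):
    print(b, amplitude_from_period(lambda E: T0, b), math.sqrt(2.0 * b))
```

The two printed columns agree, which is the numerical counterpart of the statement that a constant period forces the parabolic potential.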
The remark after Eq. (2.10.6) provides the following corollary. 


16 Corollary. Let E → T(E), E ∈ (V(0), +∞), be the period of the motions
with energy E developing under the action of a potential verifying Eq. (2.10.1)
and let V(0) = 0. Then V is given by


\int_0^{V(\xi)} \frac{T(E) dE}{\sqrt{V(\xi) - E}} = 4\pi \sqrt{\frac{m}{2}} \xi,   \forall \xi \ge 0.    (2.10.9)


Observations.

(1) In the above proof, it is necessary that V(ξ) = V(−ξ): if V verifies only
(ii) and (iii) of Eq. (2.10.1), then Eq. (2.10.3) is no longer correct and should
be replaced by


T(E) = 2 \int_{x_-(E)}^{x_+(E)} ( \frac{2}{m} (E - V(\xi)) )^{-1/2} d\xi,    (2.10.10)

where x_+(E), x_−(E) are the roots of E − V(ξ) = 0 [uniquely defined by Eq.
(2.10.1), (ii) and (iii)]. Proceeding as in the proof of Propositions 15 and 16,
after splitting Eq. (2.10.10) into two integrals like Eq. (2.10.3) between x_−(E)
and 0 and between 0 and x_+(E), it follows:


x_+(b) - x_-(b) = \frac{1}{\pi \sqrt{2m}} \int_{V(0)}^{b} \frac{T(E) dE}{\sqrt{b - E}}    (2.10.11)


determining x_+(b) − x_−(b) in terms of the period function.

(2) Therefore, because of observation (1), there are infinitely many C^∞ functions
ξ → V(ξ) verifying (ii) and (iii) and leading to motions with energy-independent
period. They can be visualized by saying that their graphs are
obtained by horizontally deforming the parabolae of Eq. (2.10.8), keeping fixed
the distances between the values x_−(E) and x_+(E) such that V(x_±(E)) = E.
Hence a necessary and sufficient condition that V, verifying (ii) and (iii) of
Eq. (2.10.1), generates isochronous periodic motions is that for all E > V(0),


x_+(E) - x_-(E) = k' \sqrt{E - V(0)}    (2.10.12)


for some k' > 0.
(3) Note that Eq. (2.10.9) does not, in general, imply that there is a V ∈
C^∞(R) verifying it for arbitrarily given E → T(E) (see the problems below).


2.10.1 Exercises and Problems 


1. Determine the potential V verifying Eq. (2.10.1) and V(0) = 0 such that T(E) is (1 + E)
or (1 + E²) or log(1 + E); check whether V ∈ C^∞(R) or V ∈ C^∞(R \ {0}).


2. Let E → T(E) > 0 be a C^∞ function defined for E ≥ 0. Suppose that T(E) =
T_0 (1 + \sum_{k=1}^{\infty} \tau_k E^k) for E small enough and suppose |τ_k| ≤ ρ^k for some ρ > 0. Show that
ξ(b) in Eq. (2.10.6) is given by


osnih Sn fr) 


for b small enough. 


3. In the context of Problem 2, using the implicit functions theorem (see Appendix G) to 
invert the function 


_— 1 gd 
e=TV(1+> Evin f =) 
2 et o0 (1-— zx)? 
to obtain V as a function of ξ² for ξ² small, show that there is a V ∈ C^∞(R) verifying
Eq. (2.10.1) and producing motions with energy E whose period is T(E) for all E small
enough; assume π(2m)^{1/2} = 1.

4.* Let E → T(E) > 0 be a C^∞ function defined for E ≥ 0. Show that given N > 0,
there is a C^∞ function Δ_{N+1} such that the function ξ(b) in Eq. (2.10.6) can be expressed as,
assuming π(2m)^{1/2} = 1 and E small


\xi(b) = T_0 \sqrt{b} ( 1 + \sum_{k=1}^{N} \bar{\tau}_k b^k + b^{N+1} \Delta_{N+1}(b) )


where \bar{\tau}_1, ..., \bar{\tau}_N are suitably chosen constants and T_0 = T(0). (Hint: Use the
Lagrange-Taylor expansion to order N on the left-hand side of Eq. (2.10.6) to express T(E) (see
Appendix B).)


5.* Using the result of Problem 4, indicate which, among the following functions E → T(E),
cannot be the function giving the periods of the motions with energy E of some even C^∞


potential: Æ (1+ £), E 1 + (cos E)?, E 14 (2X2), E 14 t (sin VE)?, 


E → 1 + sinh √E, E → 1 + log(1 + E), E → 1 + log(1 + √E). (Hint: The problem
is essentially whether the function (2.10.6) really can be used to obtain b (i.e., V) as a
function of ξ which, also, is C^∞.)


6.* Let V → ξ(V) be defined by


\xi(V) = \frac{\sqrt{V}}{2\pi \sqrt{2m}} \int_0^1 \frac{T(xV) dx}{\sqrt{1 - x}}

obtained from Eq. (2.10.6) by setting b = V, V(0) = 0, and changing the integration
variable. Assume that E → T(E) is a positive C^∞ function of E ∈ [0, +∞). Show that a
necessary and sufficient condition for the existence of a potential V verifying Eq. (2.10.1)
and producing motions with energy E > 0 with period T(E) is that ξ(V) → +∞ as V → +∞ and
ξ'(V) > 0, ∀V > 0. Show also that this happens if T'(E) ≥ 0, ∀E ≥ 0, and does not
necessarily happen if one only supposes T(E) bounded for E ≥ 0; show, however, that such
conditions are only sufficient conditions. (Hint: for V near 0 the analysis is in Problems 2
through 5 above; if for some V_0 it is ξ'(V_0) = 0 the inverse function ξ → V(ξ) cannot be
C^∞ while if ξ'(V_0) < 0 it cannot be globally defined for ξ ∈ R. If T(E) is only supposed
bounded a counterexample is T(E) = 1+ € cos 5 for € small enough.) 




7. Let V € CO)(R) verify (i), (ii), and (iii) of Eq. (2.10.1) and suppose that V € 
C™((—oo, 0) U (0, +20)) and V(0) = 0. Define t — x(t), t > 0 to be a motion gener- 
ated by V if « is a C@) function verifying ż(t)? + V(a(t)) = E > 0 and ż(t) changes sign 
to the right and to the left of any time t when «(t) = 0. Show that any initial datum (0), 
x(0) gives rise to a unique motion generated by V and respecting the datum, if E > 0. 


8.* Show that if in Problem 6 one only drops the condition T(E) > 0 replacing it by 
T(E) > 0, one has the same results, provided V is allowed to vary in the class of potentials 
considered in Problem 7. 


9. Find a calculation algorithm for the tabulation of a function ξ → V(ξ) which generates
motions with period log(1 + √E) with 30% accuracy as E varies in the interval 4 ≤ E ≤ 10.
(Hint: Define T(E) “arbitrarily” for E ∉ [4, 10] and use Eq. (2.10.6).)


10. Using a desk computer, actually perform the calculations in Problem 9, drawing (on the
screen) the graph of V (without tabulating it) and the graph of the amplitude x(E), E ∈
[4, 10].
2.11 The Damped Oscillator: Euler’s Formulae 


In §2.7 we saw that the harmonic oscillator is a system with the absolutely 
remarkable property of exhibiting only periodic motions with the same period. 

In this section, and in the following, we shall examine other important 
properties of harmonic oscillators before dedicating some attention to the 
study of the stability of such properties with respect to “small” modifications 
of the force law. Consider a point mass with mass m > 0 whose motions are
described by the equation


m \ddot{x}(t) = -k x(t) - \lambda \dot{x}(t) + \varphi(t),    (2.11.1)


where k > 0, λ > 0, and φ ∈ C^∞(R) is a preassigned function. Equation
(2.11.1) is a normal differential equation (see §2.5), as can be readily verified
by multiplying it by ẋ(t) and obtaining


\frac{dE}{dt}(t) = -\lambda \dot{x}(t)^2 + \varphi(t) \dot{x}(t) \le \max_{\eta} ( -\lambda \eta^2 + \eta \varphi(t) ) = \frac{\varphi(t)^2}{4\lambda}    (2.11.2)


if E(t) = ½mẋ(t)² + ½kx(t)². Hence, for all t ≥ 0, we find the a priori estimate


E(t) = \frac{m \dot{x}(t)^2}{2} + \frac{k x(t)^2}{2} \le E(0) + \int_0^t \frac{\varphi(\tau)^2}{4\lambda} d\tau    (2.11.3)

which implies normality by Proposition 5, p. 28.
Motions described by Eq. (2.11.1) are called “forced oscillations” of a linearly
damped harmonic oscillator. In this section we shall study the case
φ = 0, i.e., the equation


m \ddot{x} = -\lambda \dot{x} - k x    (2.11.4)


describing linearly damped oscillators.

The arguments used to prove the strong stability criterion, Proposition
14, p. 44, can be adapted to the particular case of Eq. (2.11.4) and lead
to conclude that its motions have a trivial asymptotic behavior as t → +∞:
lim_{t→+∞} x(t) = 0.

Actually Eq. (2.11.4) can be “explicitly” solved and from the formulae of 
the solution one gets a very detailed description of the motions as shown by 
the following proposition. 


17 Proposition. Given (η_0, ξ_0, t_0) ∈ R³, there exist A_0, A_0' in R such that
the solution of Eq. (2.11.4) with initial datum

\dot{x}(t_0) = \eta_0,   x(t_0) = \xi_0    (2.11.5)


can be written as


x(t) = e^{-\frac{\lambda}{2m}(t - t_0)} ( A_0 e^{\frac{\lambda}{2m}\sqrt{1 - \frac{4mk}{\lambda^2}} (t - t_0)} + A_0' e^{-\frac{\lambda}{2m}\sqrt{1 - \frac{4mk}{\lambda^2}} (t - t_0)} )    (2.11.6)


if λ² > 4mk; or as


x(t) = e^{-\frac{\lambda}{2m}(t - t_0)} ( A_0 \cos \sqrt{\frac{k}{m}(1 - \frac{\lambda^2}{4mk})} (t - t_0)
       + A_0' \sin \sqrt{\frac{k}{m}(1 - \frac{\lambda^2}{4mk})} (t - t_0) )    (2.11.7)

if λ² < 4mk; or, if λ² = 4mk, as

x(t) = e^{-\frac{\lambda}{2m}(t - t_0)} ( A_0 + A_0' (t - t_0) )    (2.11.8)

Observations. 


(1) Remark that lim_{t→+∞} x(t) = 0 exponentially fast for all solutions.

(2) There are two time scales in the motions described above (they coincide if
λ² = 4mk). For small λ (compared with (4mk)^{1/2}), one time scale is 2m/λ and
the other is 2π(m/k)^{1/2} and 2m/λ ≫ 2π(m/k)^{1/2}. The first time scale controls the
damping (“friction time scale”) and the other controls the oscillatory motion
(“proper time scale”) [see Eq. (2.11.7)].

(3) The above solutions can be continued to solutions of Eq. (2.11.4) on the
entire time range. However, limsup_{t→−∞} |x(t)| = +∞ unless x(t) ≡ 0.


PROOF. A possible proof is by direct verification, i.e., by inserting Eqs.
(2.11.6)-(2.11.8) into Eq. (2.11.4) and by checking that in each case the initial
data can be satisfied by suitably choosing A_0, A_0'. We present a more instructive
proof which illustrates a general method and allows us to introduce some
new mathematical notions. Look for solutions of Eq. (2.11.4) having the form

x(t) = A e^{\alpha t},   A \ne 0    (2.11.9)

By inserting Eq. (2.11.9) into Eq. (2.11.4), we see that in order that Eq.
(2.11.9) be a solution it must be

m \alpha^2 + \lambda \alpha + k = 0;    (2.11.10)


hence, α = α_+ or α = α_− with


\alpha_\pm = -\frac{\lambda}{2m} ( 1 \mp \sqrt{1 - \frac{4mk}{\lambda^2}} )    (2.11.11)

If λ² > 4mk, there are no problems. For t ∈ R, setting

x(t) = A_0 e^{\alpha_+ (t - t_0)} + A_0' e^{\alpha_- (t - t_0)}    (2.11.12)


one obtains a solution of Eq. (2.11.4) for all A_0, A_0' ∈ R, since Eq. (2.11.4)
is a linear homogeneous equation. Imposing the initial conditions yields the
system

\xi_0 = A_0 + A_0',   \eta_0 = \alpha_+ A_0 + \alpha_- A_0'    (2.11.13)

whose determinant has absolute value |α_+ − α_−| = (λ/m)(1 − 4mk/λ²)^{1/2} ≠ 0. This
proves the proposition if λ² > 4mk.

The case λ² = 4mk can be obtained by first letting λ² > 4mk, solving Eqs.
(2.11.4) and (2.11.5), and taking the limit λ² → 4mk and using the regularity
theorem, Proposition 3, for differential equations.

The determination of A_0 and A_0' from Eq. (2.11.13) gives, for λ² > 4mk,


x(t) = \eta_0 \frac{e^{\alpha_+ (t - t_0)} - e^{\alpha_- (t - t_0)}}{\alpha_+ - \alpha_-} - \xi_0 \frac{\alpha_- e^{\alpha_+ (t - t_0)} - \alpha_+ e^{\alpha_- (t - t_0)}}{\alpha_+ - \alpha_-}    (2.11.14)


which, as λ² → 4mk, gives


x(t) = ( \xi_0 + (\eta_0 + \frac{\lambda}{2m} \xi_0)(t - t_0) ) e^{-\frac{\lambda}{2m}(t - t_0)}    (2.11.15)


For λ² < 4mk, the roots α_± are complex and Eq. (2.11.9) does not directly
make sense. However, if we could give a meaning to the exponential of a
complex number in such a way that the function t → e^{zt} has the properties


\frac{d}{dt} e^{zt} = z e^{zt},   \forall z \in C    (2.11.16)


and, of course, e^z = \sum_{k=0}^{\infty} z^k/k! for z real, we could still take Eq. (2.11.14)
as the solution to Eqs. (2.11.4) and (2.11.5). It is natural to define ∀z ∈ C


e^z = \sum_{k=0}^{\infty} \frac{z^k}{k!}    (2.11.17)


since the series is absolutely convergent even if z is complex.

It is then possible to check Eq. (2.11.16) by series differentiation of Eq.
(2.11.17) with z replaced by zt: in fact, such a series can be differentiated
term by term. Some remarkable properties of e^z are


(i)   e^{z} e^{z'} = e^{z + z'},   \overline{e^{z}} = e^{\bar{z}}    (2.11.18)


where the bar denotes complex conjugation. This property can be checked by
series multiplication, as for z real, and by conjugation of the series.


(ii)  e^{x + iy} = e^{x} (\cos y + i \sin y),   \forall x, y \in R    (2.11.19)


which is checked by recalling the Taylor series for the sine and cosine:


e^{x + iy} = e^{x} e^{iy} = e^{x} \sum_{k=0}^{\infty} \frac{(iy)^k}{k!} = e^{x} ( \sum_{k=0}^{\infty} \frac{(-1)^k y^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} \frac{(-1)^k y^{2k+1}}{(2k+1)!} )    (2.11.20)

(iii) By Eq. (2.11.19), one has

\cos y = \frac{e^{iy} + e^{-iy}}{2},   \sin y = \frac{e^{iy} - e^{-iy}}{2i}    (2.11.21)


Hence, we see that Eq. (2.11.14) gives a solution to Eqs. (2.11.4) and
(2.11.5), even if λ² < 4mk, by interpreting the complex exponentials as given
by Eqs. (2.11.17) and (2.11.19). Note that Eq. (2.11.14) defines a real function
of t, as α_+ is the complex conjugate of α_− and the coefficients of η_0, ξ_0 in
Eq. (2.11.14) are therefore real because of the second relation in Eq. (2.11.18).
Since, by (2.11.19):


Re e^{\alpha_+ t} = e^{-\frac{\lambda}{2m} t} \cos \sqrt{\frac{k}{m} - \frac{\lambda^2}{4m^2}} t,
                                                                              (2.11.22)
Im e^{\alpha_+ t} = e^{-\frac{\lambda}{2m} t} \sin \sqrt{\frac{k}{m} - \frac{\lambda^2}{4m^2}} t,

Eq. (2.11.7) follows from Eqs. (2.11.14) and (2.11.22). mbe
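
The unification of the three cases through complex exponentials can be seen at work in a few lines of code. The Python sketch below is an illustration, not a definitive implementation: the function name and the test values are assumptions made here, the roots are taken as (−λ ± (λ² − 4mk)^{1/2})/(2m), and the critical case λ² = 4mk is handled separately through the limiting expression, since the general formula contains a vanishing denominator there.

```python
import cmath
import math

def damped_solution(t, xi0, eta0, m, lam, k, t0=0.0):
    """Solution of m x'' = -lam x' - k x with x(t0) = xi0, x'(t0) = eta0."""
    s = t - t0
    disc = lam * lam - 4.0 * m * k
    if disc == 0.0:
        # Critical damping: limit of the general formula as lam^2 -> 4 m k.
        a = -lam / (2.0 * m)
        return (xi0 + (eta0 - a * xi0) * s) * math.exp(a * s)
    root = cmath.sqrt(disc)            # real if overdamped, imaginary otherwise
    a_plus = (-lam + root) / (2.0 * m)
    a_minus = (-lam - root) / (2.0 * m)
    e_plus, e_minus = cmath.exp(a_plus * s), cmath.exp(a_minus * s)
    x = (eta0 * (e_plus - e_minus)
         - xi0 * (a_minus * e_plus - a_plus * e_minus)) / (a_plus - a_minus)
    return x.real                      # the imaginary part vanishes up to rounding

# Underdamped example compared with the explicit cosine/sine form.
m, lam, k, xi0, eta0 = 1.0, 0.2, 1.0, 1.0, 0.5
g = lam / (2.0 * m)
w = math.sqrt(k / m - g * g)
t = 3.0
explicit = math.exp(-g * t) * (xi0 * math.cos(w * t) + (eta0 + g * xi0) / w * math.sin(w * t))
print(damped_solution(t, xi0, eta0, m, lam, k), explicit)
```

The same function covers the overdamped case without any change, which is the practical content of the remark that the two regimes are formally unified.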


Observations 


(1) Using the representation (2.11.14) and the complex exponentials, the two
cases λ² > 4mk and λ² < 4mk are formally unified. This is the first instance,
among several that we shall meet, where the use of complex valued functions
appears useful and simplifies formulae and calculations even in problems in
which one is eventually only interested in “real-valued results”.

(2) The formula:


e^{x + iy} = e^{x} (\cos y + i \sin y),   \forall x, y \in R    (2.11.23)


is called “Euler’s formula” and it will be widely used in the following.
It is remarkable that the polar representation of a complex number z =
ρ(cos θ + i sin θ) becomes, because of Eq. (2.11.23):


z = \rho e^{i\theta},    (2.11.24)

and also |e^{iy}| = 1, ∀y ∈ R is true and more generally:

|e^{x + iy}| = e^{x},   \forall x, y \in R    (2.11.25)


so that e^z ≠ 0, ∀z ∈ C.
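
Euler's formula can be checked directly from the series definition of the exponential. The Python sketch below is a small illustration: the truncation at 60 terms and the sample point z = 0.3 + 2i are arbitrary choices made here.

```python
import math

def exp_series(z, terms=60):
    """Partial sum of the exponential series sum_k z^k / k! for complex z."""
    total = 0j
    term = 1 + 0j                      # z^0 / 0!
    for k in range(terms):
        total += term
        term *= z / (k + 1)            # next term: z^(k+1) / (k+1)!
    return total

z = complex(0.3, 2.0)
lhs = exp_series(z)
rhs = math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))
print(abs(lhs - rhs))                  # difference at rounding level
print(abs(exp_series(2.0j)))           # modulus of e^{iy}, expected close to 1
```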


2.11.1 Exercises and Problems 


1. Through Euler’s formulae prove the “De Moivre formula”: i.e., show that for every
integer n ≥ 0, (cos θ + i sin θ)^n = cos nθ + i sin nθ.


2. Through Euler’s formulae and the Newton binomial, show that for n ≥ 0 and integer:


(\cos\theta)^n = ( \frac{e^{i\theta} + e^{-i\theta}}{2} )^n = 2^{-n} \sum_{k=0}^{n} \binom{n}{k} \cos(n - 2k)\theta.


3. Study the analogue to Problem 2 for (sin 0)”. 


4. Via Euler’s formulae, compute 0 cos j@ using the addition formula for geometric 
series. 


5. Compute 


Qn do 20 dé 
J emn, i: (cos 0)” —, ne Z4. 
0 2T 0 2T 


using Euler’s formulae and Problem 2. 
6. Compute 


2T dé 27 dé 
f (sin 6)” —, f (sin 0)” (cos 0)” —, n,m E Z+. 
0 27 0 2T 


7. Find two linearly independent solutions of ž + «+a = 0 and compute their determinant 
w(t), t > 0 (see Problem 16 in §2.2). 


8. Consider the system of equations in R4: x = Lx, where L is a d x d matrix L = 
(€:3)i,j=1,...,4 With constant coefficients. Determine whether there are solutions having the 
form z(t) = e°tx(0). Which algebraic equation does a satisfy? Which equation does x(0) 
have to verify? (See also Appendices E and F). 


9. Apply the method suggested in Problem 8 to find two linearly independent solutions of 
&=axr+y, y =—x + ay and describe the flow (St)+>o in the data space as a varies. 


10. Compute the time interval between the n-th and the (n + 1)-th passage through the 


origin of the solutions of + t + x = 0 and %+ 4a +a = 0, in the limit n — +00. 


2.12 Forced Harmonic Oscillations in Presence of 
Friction 


We now consider Eq. (2.11.1) with $\varphi \neq 0$. Its motions are the “linearly damped
harmonic oscillations with forcing term $\varphi$”.

An obvious but important remark about Eq. (2.11.1) is that its most
general solution can be written as the sum of a particular solution $t \to x_{\rm part}(t)$, $t \ge 0$, of Eq. (2.11.1) and of a solution of Eq. (2.11.4), i.e., of the
homogeneous equation associated with Eq. (2.11.1). In fact, the linearity of
this equation implies that the difference between two of its solutions is a
solution of Eq. (2.11.4). Hence, in formulae, a solution $t \to x(t)$, $t \ge 0$, of Eq.
(2.11.1) can be written:

$$x(t) = x_{\rm part}(t) + x_0(t),$$ (2.12.1)

where $t \to x_0(t)$ is a solution of Eq. (2.11.4).
In §2.11 we saw that $\lim_{t\to+\infty}x_0(t) = 0$ and, furthermore, we found explicit
expressions for the most general solution $t \to x_0(t)$. Hence, the discussion of
the properties of the motions described by Eq. (2.11.1) is reduced to that of a
particular solution of the same equation which we can choose as convenience
suggests. This remark is particularly relevant whenever one is interested in
the “asymptotic behavior” as $t \to +\infty$, where $t \to x_0(t)$ is infinitesimal.

Let us now describe a method for the construction of a particular solution
to Eq. (2.11.1) valid in the interesting though special case when $\varphi$ is periodic
with period T > 0.


18 Proposition. Let $\varphi \in C^\infty(\mathbb{R})$ be a real-valued periodic function with
period T > 0. Then Eq. (2.11.1) admits a solution with the same period.

Observation. Consequently, we can say that all the motions described by Eq.
(2.11.1) with a periodic forcing term are “asymptotically periodic”: this means
that there is a periodic solution $t \to x_{\rm per}(t)$, $t \in \mathbb{R}_+$, of Eq. (2.11.1) such that
any other solution $t \to x(t)$ has the property $|x(t) - x_{\rm per}(t)| \to 0$ as $t \to +\infty$.


PROOF. First consider the apparently special cases

$$\varphi(t) = \hat\varphi\cos\frac{2\pi}{T}t \quad\text{or}\quad \varphi(t) = \hat\varphi\sin\frac{2\pi}{T}t, \qquad \hat\varphi\in\mathbb{R}$$ (2.12.2)

and remark that they can be treated simultaneously by solving the equation

$$m\ddot x + \lambda\dot x + kx = \hat\varphi\, e^{i\frac{2\pi}{T}t}$$ (2.12.3)

In fact, the real and imaginary parts of a solution to Eq. (2.12.3) are
solutions to Eq. (2.11.1) with $\varphi$ given, respectively, by the first or the second
function in Eq. (2.12.2), as implied by Euler’s formulae $\mathrm{Re}\, e^{i\frac{2\pi}{T}t} = \cos\frac{2\pi}{T}t$, $\mathrm{Im}\, e^{i\frac{2\pi}{T}t} = \sin\frac{2\pi}{T}t$.

On the other hand, remembering the properties of the complex exponentials
(i.e., $(d/dt)e^{zt} = ze^{zt}$), Eq. (2.12.3) admits a particular periodic solution

$$x_{\rm per}(t) = \frac{\hat\varphi\, e^{i\frac{2\pi}{T}t}}{-m(\frac{2\pi}{T})^2 + i\lambda\frac{2\pi}{T} + k}$$ (2.12.4)

Hence, the particular cases (2.12.2) are solved by the real and imaginary parts
of Eq. (2.12.4), respectively.

To analyze more general cases, linearity of Eq. (2.11.1) can be used again.
If this equation is considered with right-hand side $\varphi \in C^\infty(\mathbb{R})$ or $\psi \in C^\infty(\mathbb{R})$
and if $t \to x_\varphi(t)$ and $t \to x_\psi(t)$, $t \in \mathbb{R}_+$, are particular solutions of it, then
$t \to x_\varphi(t) + x_\psi(t)$, $t \in \mathbb{R}_+$, is a particular solution of Eq. (2.11.1) with right-hand
side $\varphi + \psi$. Consider, then, the case:

$$\varphi(t) = \sum_{n=0}^{N}\hat\varphi^{(1)}_n\cos\frac{2\pi}{T}nt + \sum_{n=0}^{N}\hat\varphi^{(2)}_n\sin\frac{2\pi}{T}nt,$$ (2.12.5)

where $\hat\varphi^{(1)}_n, \hat\varphi^{(2)}_n$, n = 0, 1, 2, ..., N, are real constants. By Euler’s formulae,
Eq. (2.12.5) can be written as:

$$\varphi(t) = \sum_{n=-N}^{N}\hat\varphi_n e^{i\frac{2\pi}{T}nt},$$ (2.12.6)

where $\hat\varphi_n$ is defined by

$$\hat\varphi_n = \overline{\hat\varphi_{-n}} = \frac{\hat\varphi^{(1)}_n - i\hat\varphi^{(2)}_n}{2}, \quad n > 0; \qquad \hat\varphi_0 = \hat\varphi^{(1)}_0.$$ (2.12.7)


Hence, a particular solution of Eq. (2.11.1) with $\varphi$ given by Eq. (2.12.5) [or
Eq. (2.12.6)] is

$$x_{\rm per}(t) = \sum_{n=-N}^{N}\frac{\hat\varphi_n e^{i\frac{2\pi}{T}nt}}{-m(\frac{2\pi}{T}n)^2 + i\lambda\frac{2\pi}{T}n + k}$$ (2.12.8)

which is real since the addends in Eq. (2.12.8) with index $n$ and $-n$ are
complex conjugates because of Eq. (2.12.7). So the proposition is proved when
$\varphi$ is given by Eq. (2.12.5) or Eqs. (2.12.6) and (2.12.7). The same methods
can be applied to the case when $\varphi$ is given by:

$$\varphi(t) = \sum_{n=-\infty}^{+\infty}\hat\varphi_n e^{i\frac{2\pi}{T}nt}, \quad t\in\mathbb{R}, \quad\text{with}$$ (2.12.9)

$$\hat\varphi_n = \overline{\hat\varphi_{-n}}, \qquad n = 0, 1, \ldots$$ (2.12.10)

provided the series (2.12.9) converges well enough so that the function $t \to x_{\rm per}(t)$, $t\in\mathbb{R}$, defined by

$$x_{\rm per}(t) = \sum_{n=-\infty}^{+\infty}\frac{\hat\varphi_n e^{i\frac{2\pi}{T}nt}}{-m(\frac{2\pi}{T}n)^2 + i\lambda\frac{2\pi}{T}n + k}$$ (2.12.11)

is of class $C^\infty$ and its first and second derivatives (at least) can be computed
by summing the corresponding derivatives of the functions in Eq. (2.12.11).

A simple sufficient condition for these properties is that there is a constant
$C_p$ such that

$$|\hat\varphi_n| \le \frac{C_p}{1+|n|^p}, \qquad n = 0, \pm1, \pm2, \ldots$$ (2.12.12)

for all p ≥ 0 or, equivalently:

$$\lim_{n\to\infty}|\hat\varphi_n|(1+|n|^p) = 0, \qquad \forall p \ge 0$$ (2.12.13)

If Eq. (2.12.12) holds, the series (2.12.9) is uniformly convergent together with
the derivative series obtained by differentiating Eq. (2.12.9) term by term an
arbitrary number of times. For instance, the series of the k-th derivatives of
Eq. (2.12.9) is

$$\sum_{n=-\infty}^{+\infty}\hat\varphi_n\Big(\frac{2\pi i}{T}n\Big)^k e^{i\frac{2\pi}{T}nt},$$ (2.12.14)

and its n-th term has a modulus bounded by

$$|\hat\varphi_n|\Big(\frac{2\pi}{T}|n|\Big)^k \le C_p\Big(\frac{2\pi}{T}\Big)^k\frac{|n|^k}{1+|n|^p}$$ (2.12.15)

by Eq. (2.12.12) and by $|e^{i\frac{2\pi}{T}nt}| = 1$. The right-hand side of Eq. (2.12.15) is
t independent and can also be summed over n if one chooses the (arbitrary)
parameter p > k + 1. Then the series differentiation theorems guarantee that
Eq. (2.12.9) is a $C^\infty$ function whose derivatives can be computed by “series
differentiation”.

Hence, the proposition is proved also when $\varphi$ is given by Eqs. (2.12.9) and
(2.12.10) with $\hat\varphi_n$ verifying Eq. (2.12.12), $\forall p \ge 0$, i.e., with $\hat\varphi_n$ decreasing
faster than any power as $n \to \infty$.
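The construction of the periodic solution from the harmonics of the forcing, Eq. (2.12.8), can be tried out numerically. The sketch below is only an illustration under stated assumptions: the parameter values, the forcing and the function name `periodic_response` are hypothetical and not taken from the text. It builds $x_{\rm per}$ for a trigonometric polynomial forcing and checks that it is real, T-periodic and satisfies the equation.

```python
import cmath
import math

def periodic_response(harmonics, m, lam, k, T):
    """Build x_per and its first two derivatives from Eq. (2.12.8).

    harmonics maps n -> phi_hat_n, with harmonics[-n] the complex
    conjugate of harmonics[n] so that the forcing is real.
    """
    w = 2.0 * math.pi / T
    # Divide each harmonic by -m (w n)^2 + i lam w n + k.
    amp = {n: c / (-m * (w * n) ** 2 + 1j * lam * w * n + k)
           for n, c in harmonics.items()}

    def deriv(order):
        # Term-by-term differentiation multiplies the n-th term by (i w n)^order.
        return lambda t: sum(a * (1j * w * n) ** order * cmath.exp(1j * w * n * t)
                             for n, a in amp.items())

    return deriv(0), deriv(1), deriv(2)

# Hypothetical parameters and forcing phi(t) = 1 + 2 cos(wt) + 3 sin(2wt),
# converted to complex harmonics through Eq. (2.12.7).
m, lam, k, T = 1.0, 0.4, 2.0, 3.0
harmonics = {0: 1.0, 1: 1.0, -1: 1.0, 2: -1.5j, -2: 1.5j}
w = 2.0 * math.pi / T

def phi(t):
    return 1.0 + 2.0 * math.cos(w * t) + 3.0 * math.sin(2.0 * w * t)

x, dx, ddx = periodic_response(harmonics, m, lam, k, T)

times = [0.0, 0.3, 1.1, 2.9]
# Residual of m x'' + lam x' + k x - phi and size of the imaginary part.
residual = max(abs(m * ddx(t) + lam * dx(t) + k * x(t) - phi(t)) for t in times)
imag_part = max(abs(x(t).imag) for t in times)
period_gap = abs(x(0.7 + T) - x(0.7))
print(residual, imag_part, period_gap)
```

The three printed quantities should all be at rounding level, reflecting that each harmonic is simply divided by $-m(\frac{2\pi}{T}n)^2 + i\lambda\frac{2\pi}{T}n + k$.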

The following very important proposition tells us that the last case con- 
sidered is, actually, the most general and therefore completes our proof. 


19 Proposition. Let T > 0 and $\varphi \in C^\infty(\mathbb{R})$ be a periodic function with
period T. There exists a unique sequence $(\hat\varphi_n)_{n\in\mathbb{Z}}$ of complex numbers such
that

(i) $\hat\varphi_n = \overline{\hat\varphi_{-n}}, \quad n = 0, 1, 2, \ldots$; (2.12.16)
(ii) $\lim_{n\to\infty}(1+|n|)^p|\hat\varphi_n| = 0, \quad \forall p\in\mathbb{Z}_+$; (2.12.17)
(iii) $\varphi(t) = \sum_{n=-\infty}^{+\infty}\hat\varphi_n e^{i\frac{2\pi}{T}nt}, \quad \forall t\in\mathbb{R}$. (2.12.18)

The $\hat\varphi_n$ are called the “harmonics” of $\varphi$ with respect to the period T and

(iv) $\hat\varphi_n = \frac{1}{T}\int_0^T\varphi(t)\,e^{-i\frac{2\pi}{T}nt}\,dt, \quad \forall n\in\mathbb{Z}$, (2.12.19)

and, finally, $\forall s = 0, 1, \ldots$:

$$\frac{d^s\varphi}{dt^s}(t) = \sum_{n=-\infty}^{+\infty}\Big(\frac{2\pi i}{T}n\Big)^s\hat\varphi_n e^{i\frac{2\pi}{T}nt}, \qquad \forall t\in\mathbb{R}$$ (2.12.20)

Observations.

(1) Equation (2.12.17) can also be read as: the sequence $(\hat\varphi_n)_{n\in\mathbb{Z}}$ approaches
zero, as $n \to \infty$, “faster than any power”. It is equivalent to Eq. (2.12.12).
(2) Proposition 19 implies, via Eqs. (2.12.18) and (2.12.19), that two $C^\infty$
functions periodic with the same period T > 0 coincide if and only if all their
harmonics relative to the period T coincide.
(3) Proposition 19 is a “structure theorem” on the $C^\infty$-periodic functions on
$\mathbb{R}$: it is the “Fourier series theorem”.

The proof of this proposition will be given in the next section and it will 
also conclude the proof of Proposition 18. 
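The statements of Proposition 19 can be explored numerically before reading the proof. The following sketch is only illustrative and rests on assumptions not in the text: the test function $e^{\cos t}$, the sample count and the helper names are arbitrary, and the integral (2.12.19) is replaced by an equally spaced Riemann sum.

```python
import cmath
import math

def harmonics(phi, T, n_max, samples=512):
    """Approximate Eq. (2.12.19) by an equally spaced Riemann sum.

    Returns a dict n -> phi_hat_n for |n| <= n_max.
    """
    ts = [T * j / samples for j in range(samples)]
    values = [phi(t) for t in ts]
    w = 2.0 * math.pi / T
    return {
        n: sum(v * cmath.exp(-1j * w * n * t) for v, t in zip(values, ts)) / samples
        for n in range(-n_max, n_max + 1)
    }

def partial_sum(coeffs, T, t):
    """Truncated version of the series (2.12.18)."""
    w = 2.0 * math.pi / T
    return sum(c * cmath.exp(1j * w * n * t) for n, c in coeffs.items())

# Hypothetical test function: smooth, real, periodic with period T = 2 pi.
T = 2.0 * math.pi
phi = lambda t: math.exp(math.cos(t))

coeffs = harmonics(phi, T, n_max=12)
t0 = 0.7
reconstruction_error = abs(partial_sum(coeffs, T, t0) - phi(t0))
conjugation_error = max(abs(coeffs[n] - coeffs[-n].conjugate()) for n in range(13))
print(reconstruction_error, conjugation_error)
print([abs(coeffs[n]) for n in range(0, 13, 3)])  # rapid decay of the harmonics
```

For this function the harmonics decay so fast that a dozen of them already reproduce $\varphi(t_0)$ to many digits, in line with property (ii).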


2.13 Fourier’s Series for $C^\infty$-Periodic Functions


Preliminary to the proof of Proposition 19 of §2.12, remark that if a function
$t \to \varphi(t)$, $t\in\mathbb{R}$, is defined by Eq. (2.12.18) with $(\hat\varphi_n)_{n\in\mathbb{Z}}$ verifying Eqs.
(2.12.16) and (2.12.17), then $\varphi$ is necessarily a $C^\infty$ function, by the series
differentiation theorem [see, also, the considerations concerning Eqs. (2.12.14)
and (2.12.15)]. Furthermore, since Eq. (2.12.18) is, in this case, uniformly
convergent:

$$\int_0^T e^{-i\frac{2\pi}{T}nt}\varphi(t)\,\frac{dt}{T} = \sum_{k=-\infty}^{+\infty}\hat\varphi_k\int_0^T e^{-i\frac{2\pi}{T}(n-k)t}\,\frac{dt}{T}$$ (2.13.1)

by the interchangeability of the integration and the summation operations in
uniformly convergent series. However:

$$\int_0^T e^{-i\frac{2\pi}{T}(n-k)t}\,\frac{dt}{T} = \begin{cases}1 & \text{if } n = k\\ 0 & \text{if } n \ne k\end{cases}$$ (2.13.2)

as seen by explicit calculation of the integral. Relation (2.13.2) is often written

$$\int_0^T e^{-i\frac{2\pi}{T}(n-k)t}\,\frac{dt}{T} = \delta_{nk}$$ (2.13.3)

n, k = 0, ±1, ±2, ... with $\delta_{nn} = 1$, $\delta_{nk} = 0$ if $k \ne n$.
Substitution of Eq. (2.13.2) into Eq. (2.13.1) yields

$$\int_0^T e^{-i\frac{2\pi}{T}nt}\varphi(t)\,\frac{dt}{T} = \hat\varphi_n, \qquad n\in\mathbb{Z}$$ (2.13.4)

which shows that if $\varphi$ has the form of Eq. (2.12.18) with $(\hat\varphi_n)_{n\in\mathbb{Z}}$ verifying
Eq. (2.12.17), then the numbers $\hat\varphi_n$ are uniquely determined by Eq. (2.12.19).
If $\varphi$ is real, then Eq. (2.12.19) [or Eq. (2.13.4)] implies Eq. (2.12.16).

The above considerations show the validity of an “inverse” proposition to
Proposition 19 and motivate the validity of Eq. (2.12.19). They are also useful
since they allow the introduction of the fundamental relation (2.13.3).

We now give the proof of Proposition 19, §2.12 (“Fourier’s theorem”).


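PROOF. Let $\varphi\in C^\infty(\mathbb{R})$ be a real periodic function with period T > 0. Define

$$\hat\varphi_n = \frac{1}{T}\int_0^T e^{-i\frac{2\pi}{T}nt}\varphi(t)\,dt, \qquad n\in\mathbb{Z}$$ (2.13.5)

It is $\hat\varphi_n = \overline{\hat\varphi_{-n}}$ because $\varphi$ is real. Hence, Eq. (2.12.16) holds. To study the
asymptotic behavior of $\hat\varphi_n$ as $n \to \infty$, integrate Eq. (2.13.5) by parts (for $n \ne 0$):

$$\hat\varphi_n = \frac{1}{T}\Big[\frac{e^{-i\frac{2\pi}{T}nt}}{-i\frac{2\pi}{T}n}\varphi(t)\Big]_0^T - \frac{1}{T}\int_0^T\frac{e^{-i\frac{2\pi}{T}nt}}{-i\frac{2\pi}{T}n}\varphi'(t)\,dt = \frac{1}{i\frac{2\pi}{T}n}\,\frac{1}{T}\int_0^T e^{-i\frac{2\pi}{T}nt}\varphi'(t)\,dt$$ (2.13.6)

where $\varphi'$ denotes the first derivative of $\varphi$, and the periodicity of $\varphi$ has been
used to eliminate the first term in the intermediate relation.

Since $\varphi'$ is also a T-periodic $C^\infty$ function, and so are the higher derivatives
$\varphi'', \varphi''', \ldots, (d^p\varphi/dt^p) = \varphi^{(p)}$, the relation Eq. (2.13.6) can be iterated by
again integrating by parts. After p such steps, p = 0, 1, 2, ... one finds:

$$\hat\varphi_n = \frac{1}{(i\frac{2\pi}{T}n)^p}\,\frac{1}{T}\int_0^T e^{-i\frac{2\pi}{T}nt}\varphi^{(p)}(t)\,dt$$ (2.13.7)

Hence, if

$$C_p = \max_{t\in[0,T]}|\varphi^{(p)}(t)|,$$ (2.13.8)

one has, $\forall p = 0, 1, \ldots$:

$$|\hat\varphi_n| \le \Big(\frac{T}{2\pi}\Big)^p\frac{C_p}{|n|^p}, \qquad \forall n\in\mathbb{Z},\ n \ne 0$$ (2.13.9)

which is equivalent to Eq. (2.12.17).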

It remains to prove Eq. (2.12.18) with $\hat\varphi_n$, $n\in\mathbb{Z}$, given by Eq. (2.13.5).
In fact, the relation (2.12.20) is, as already remarked, a consequence of Eqs.
(2.12.18) and (2.12.17). Consider the order N approximation to the series
(2.12.18); we elaborate it by using Eq. (2.13.5):

$$\sum_{n=-N}^{N}\hat\varphi_n e^{i\frac{2\pi}{T}nt} = \frac{1}{T}\int_0^T\Big(\sum_{n=-N}^{N}e^{i\frac{2\pi}{T}n(t-\tau)}\Big)\varphi(\tau)\,d\tau$$ (2.13.10)

which is an identity, $\forall\varphi\in C^\infty(\mathbb{R})$.

The summation in the parenthesis in Eq. (2.13.10) is a $C^\infty$ function in t
and $\tau$, periodic in both variables with period T, and it has the value 2N + 1
if $\tau = t + mT$, with m an integer. It also has the property

$$\frac{1}{T}\int_0^T\Big(\sum_{n=-N}^{N}e^{i\frac{2\pi}{T}n(t-\tau)}\Big)d\tau = 1, \qquad \forall t\in\mathbb{R}$$ (2.13.11)

which follows from Eq. (2.13.3) by changing $t-\tau$ into $t'$ and by using the mentioned
periodicity. Furthermore, the function in parenthesis in Eqs. (2.13.11)
and (2.13.10) can be written as

$$\sum_{n=0}^{N}e^{i\frac{2\pi}{T}n(t-\tau)} + \sum_{n=1}^{N}e^{-i\frac{2\pi}{T}n(t-\tau)}$$ (2.13.12)

and the two sums can be evaluated as geometric sums with ratios
$e^{\pm i\frac{2\pi}{T}(t-\tau)}$. After a few steps, the result is, for m integer,

$$\sum_{n=-N}^{N}e^{i\frac{2\pi}{T}n(t-\tau)} = \begin{cases}\dfrac{\sin(N+\frac12)\frac{2\pi}{T}(t-\tau)}{\sin\frac12\frac{2\pi}{T}(t-\tau)} & \text{for } \tau \ne t+mT\\[2mm] 1+2N & \text{for } \tau = t+mT\end{cases}$$ (2.13.13)

Coming back to Eq. (2.13.10) and using Eqs. (2.13.13) and (2.13.11):

$$\sum_{n=-N}^{N}\hat\varphi_n e^{i\frac{2\pi}{T}nt} = \frac{1}{T}\int_0^T\frac{\sin(N+\frac12)\frac{2\pi}{T}(t-\tau)}{\sin\frac12\frac{2\pi}{T}(t-\tau)}\big(\varphi(\tau)-\varphi(t)+\varphi(t)\big)\,d\tau = \varphi(t) + \frac{1}{T}\int_0^T\frac{\sin(N+\frac12)\frac{2\pi}{T}(t-\tau)}{\sin\frac12\frac{2\pi}{T}(t-\tau)}\big(\varphi(\tau)-\varphi(t)\big)\,d\tau$$ (2.13.14)

Hence, to show Eq. (2.12.18), we have to show that

$$\lim_{N\to\infty}\frac{1}{T}\int_0^T\frac{\sin(N+\frac12)\frac{2\pi}{T}(t-\tau)}{\sin\frac{\pi}{T}(t-\tau)}\big(\varphi(\tau)-\varphi(t)\big)\,d\tau = 0$$ (2.13.15)

The reason why this is true is the remark that, at fixed t and $\forall m$ integer, the
function

$$\psi(\tau) = \begin{cases}\dfrac{(\varphi(\tau)-\varphi(t))\cos\frac{\pi}{T}(t-\tau)}{\sin\frac{\pi}{T}(t-\tau)} & \tau \ne t+mT\\[2mm] -\dfrac{T}{\pi}\varphi'(t) & \tau = t+mT\end{cases}$$ (2.13.16)

is just a particular $C^\infty$ function periodic with period T, so that Eq. (2.13.9),
and Euler formulae, imply Eq. (2.13.15). The proof of the periodicity property of
$\psi(\tau)$ is left as an exercise (see Problems 1-4 of this section).

By the trigonometric addition formulae, the integral in Eq. (2.13.15) then
becomes

$$\frac{1}{T}\int_0^T\Big(\psi(\tau)\sin\frac{2\pi}{T}N(t-\tau) + \big(\varphi(\tau)-\varphi(t)\big)\cos\frac{2\pi}{T}N(t-\tau)\Big)d\tau$$ (2.13.17)

It appears that this expression, via Euler’s formulae, is a linear combination
of four harmonics of order ±N of the functions of the $\tau$ variable $\tau \to \psi(\tau)$
and $\tau \to \varphi(\tau)-\varphi(t)$ which, as discussed above, are $C^\infty$ functions, periodic
with period T.

Hence, the integral in Eq. (2.13.17) must tend toward zero faster than any
power of N as $N \to \infty$: in fact, the inequalities in Eq. (2.13.9) hold for an
arbitrary T-periodic $C^\infty$ function. The same, then, occurs for Eq. (2.13.15),
and (2.12.18) is proved. ∎
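The closed form (2.13.13) of the kernel used in the proof, and its mean value (2.13.11), can be checked directly. This is a minimal sketch, written in the variable $x = \frac{2\pi}{T}(t-\tau)$; the order N = 7, the sample points and the function names are arbitrary assumptions made for illustration.

```python
import cmath
import math

def kernel_sum(N, x):
    """Left-hand side of Eq. (2.13.13) with x = (2 pi / T)(t - tau)."""
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1))

def kernel_closed_form(N, x):
    """Right-hand side of Eq. (2.13.13); x must not be a multiple of 2 pi."""
    return math.sin((N + 0.5) * x) / math.sin(0.5 * x)

N = 7
points = [0.1, 1.0, 2.5, -4.0]
closed_form_error = max(abs(kernel_sum(N, x) - kernel_closed_form(N, x)) for x in points)

# Value at x = 0 (tau = t + mT) and mean over one period, cf. Eq. (2.13.11).
value_at_zero = kernel_sum(N, 0.0).real
samples = 400
mean_value = sum(kernel_sum(N, 2.0 * math.pi * j / samples).real
                 for j in range(samples)) / samples
print(closed_form_error, value_at_zero, mean_value)
```

The value at zero equals 2N + 1 and the mean over a period equals 1, the two properties the proof relies on.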


2.13.1 Exercises and Problems 


1. Let $f\in C^\infty(\mathbb{R})$, f(0) = 0. Define $\psi(t) = f(t)/t$ for $t \ne 0$, and $\psi(0) = f'(0)$. By applying the
Taylor-Lagrange theorem (see Appendix B), show that $\psi\in C^\infty(\mathbb{R})$.

2. In the context of Problem 1, show that for k = 0, 1, ...,

$$\psi^{(k)}(t) = \frac{k!\,(-1)^k}{t^{k+1}}\Big(\sum_{h=0}^{k}\frac{(-t)^h}{h!}f^{(h)}(t)\Big), \quad \forall t \ne 0, \qquad \psi^{(k)}(0) = \frac{f^{(k+1)}(0)}{k+1},$$

where the superscript (k) denotes the kth derivative. (Hint: To check that $\psi^{(k)}(t)$ is continuous
at t = 0 (hence $C^\infty$) remark that the expression in parenthesis is the evaluation of f(0) = 0
by Taylor expansion to order k at the point t and evaluated at $-t$; hence vanishes to
$O(t^{k+1})$.)

3. Show that if $f, g\in C^\infty(\mathbb{R})$ and $g(t_0) = 0$, $g'(t_0) \ne 0$, the function

$$\psi(t) = \frac{f(t)-f(t_0)}{g(t)}, \quad t \ne t_0, \quad\text{and}\quad \psi(t_0) = \frac{f'(t_0)}{g'(t_0)}$$

is a $C^\infty$ function in the vicinity of $t_0$.

4. If $f\in C^\infty(\mathbb{R})$ and is periodic with period T, the function

$$\psi(\tau) = \begin{cases}\dfrac{(f(\tau)-f(t))\cos\frac{\pi}{T}(t-\tau)}{\sin\frac{\pi}{T}(t-\tau)} & \text{if } \tau \ne t+mT\\[2mm] -\dfrac{T}{\pi}f'(t) & \text{if } \tau = t+mT\end{cases}$$

if m is any integer, is a $C^\infty$ function of $\tau$ and it is periodic with period T. (Hint: Use
Problem 3.)


5. Using Eq. (2.12.19), compute the Fourier coefficient of order 0,1,—1 for the function 
ft)=(- $ cos t)—!, thinking of it as a periodic function with period 2r or 4r. 


6. Using the Taylor series for the function (1—£€)~*, compute the Fourier series coefficients 
of the complex-valued functions with period 2m: f(t) = (1—$e%)—! or f(t) = (1- ge), 
with m,n € Z. (Hint: (1 — 2)~* = WR (4°) (—2)*-) 


7. Let, $\forall z\in\mathbb{C}$: $\sin z = \frac{e^{iz}-e^{-iz}}{2i}$, $\cos z = \frac{e^{iz}+e^{-iz}}{2}$. Using the Taylor series for the exponential
[see Eq. (2.11.17)], determine the Fourier series coefficients of $f(t) = \sin e^{it}$ or of
$f(t) = \cos e^{it}$, $t\in\mathbb{R}$, as 2π-periodic functions or as 4π-periodic functions.

8. Let, $\forall z\in\mathbb{C}$, |z| < 1: $\log(1+z) = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}z^n$. By using the series expansion for
the exponential, Eq. (2.11.17), show that $\exp(\log(1+z)) = 1+z$ and compute the Fourier
transform of the 2π-periodic function $f(t) = -\log(1-e^{it})$.


9. Same as Problem 7 for f(t) = (1 — 4 cos a ee t E€ R. Estimate h up to 10%, i.e., 


find an expression for tas but estimate it only for n = 2. 


10. Compute the Fourier transform of f(t) a log(1 = 4 cos t) ass a 27-periodic 
function. Estimate fs up to %30. 


Aa 


11. Same as Problem 10 for f(t) = eS! Estimate up to 1% the quantities fo, f41.- 


12.* Show that all the functions in Problems 5-11 have an exponentially decaying Fourier 
transform. In each case give an estimate of the decay constant. 


13. Give an example of a C™ function, periodic with period 27 whose Fourier transform 
does not decay exponentially (Hint: First define the transform and then the function, as its 
sum.) 

int 
14. Show that the function f(t) = r ET is continuous, term by term differ- 
entiable, periodic together with its derivative and with period 27, but not C™. 


15.* Analyze critically the proof of §2.13 to deduce that if f is T-periodic continuous and 
piecewise differentiable with continuous bounded derivatives in each piece, then 


N 


PEA : E Zint 
f(t) a m x fne? 


A T — 22; sr ; A à : . 
with fn given by Jo fde T me If f is discontinuous but piecewise continuous with 


derivatives bounded and continuous in each piece, the preceding formula holds in every 


4, def ,. 
continuity point. In the discontinuity points, if f(t) aj lim;—+, f(r), the series sum is 
f(t+)+ f(t) ( 
2 


considering 0 and T as the same point from the point of view of the disconti- 
nuities). (Hint: To reduce the second part to the first, show the truth of the second part in 
the case of a function which takes just two values (i.e., which has only two discontinuities 
being otherwise constant). Then show that any function of the second type is a sum of a 
function of the first type plus a finite number of piecewise constant functions. Recall that a 
function f defined on the interval la, b] is piecewise continuous if a, b can be represented 
as a union of 7 closed intervals la, bı], laz, bə], eyes lan, bn] and, for every t = 1,..., n 
the function f coincides in the interior of [ai, bi] with a function fi continuous on the entire 


interval (ai, bi]: f may take arbitrary values at the extremes of each interval [ai, b;]). 


2.14 Nonlinear Oscillations. The Pendulum and its 
Forced Oscillations. Existence of Small Oscillations 


In the preceding sections we saw that the asymptotic period of a damped harmonic
oscillator is identical to that of the forcing (§2.12). However, the notion
of “linear” or “harmonic” oscillator is too rough a notion and, in applications,
a linear oscillator can only appear as a simplified model of some more complex
entity.

For instance, very often a linear oscillator appears as a model for the “small 
oscillations” of a system governed by a nonlinear equation: a prototype of these 
nonlinear systems is the pendulum. 

It is natural to ask the question of the stability of the properties of the 
solutions to certain classes of equations with respect to the variations of the 
equations themselves: in fact, it is clear that in applications one shall only 
“trust” the predictions which do not change by “slightly” changing the models 
themselves. This is because, as stressed in Chapter 1, there is no “absolutely 
valid model”. As an example of a motion-stability problem in the above sense, 
we shall now treat some questions concerning the pendulum forced motion;
i.e. the motion governed by the (normal) equation:

$$m\ddot x(t) + \lambda\dot x(t) + k\sin x(t) = f(t), \qquad t\in\mathbb{R}_+$$ (2.14.1)

with $\lambda$, m, and k > 0 and where $f\in C^\infty(\mathbb{R})$ is a periodic function of period
T > 0.

In the following, it will be necessary to compare several motions, functions
of t, and to fix the ideas we shall adopt, as a measure of magnitude on $[a,b]\subset I$
of a function $\varphi\in C^\infty(I)$ the quantity⁶

$$\|\varphi\|_{[a,b]} = \sup_{t\in[a,b]}|\varphi(t)|$$ (2.14.2)


We now ask if the motions of Eq. (2.14.1) have the following properties:
(1) If $t \to x(t)$, $t\in\mathbb{R}_+$, is a motion described by Eq. (2.14.1) and if
$x(0), \dot x(0), \|f\|_{\mathbb{R}_+}$ are small enough, the motion has also oscillations of small
amplitude (“existence of small oscillations”).

(2) When $\|f\|_{\mathbb{R}_+}$ is sufficiently small, Eq. (2.14.1) admits a solution with the
same period as f.

(3) As $t \to +\infty$, every solution can be asymptotically confused with the
periodic solution, in (2) above, provided such a solution exists and the data
$x(0), \dot x(0)$ are small enough.

In other words, we ask if the above three properties, which have been
explicitly or implicitly checked for the forced linear oscillations without restrictions
on $x(0), \dot x(0), \|f\|_{\mathbb{R}_+}$, are still true in a nonlinear case, at least in
the small oscillations regime. In this section we analyze problem (1) and introduce
the following proposition which “solves” it:


6 Obviously, there are other possible magnitude measures. Usually the “good” one is determined
from the needs of the particular applications. Examples of other measures are

$$\int_a^b|\varphi(t)|\,dt, \qquad \sup_{t\in[a,b]}\big(|\varphi(t)|+|\dot\varphi(t)|\big), \qquad \Big(\int_a^b|\varphi(t)|^2\,dt\Big)^{1/2}.$$



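20 Proposition. There exist constants $\gamma, \gamma' > 0$ such that if $f\in C^\infty(\mathbb{R})$ (not
necessarily periodic) and $(x_0, v_0)\in\mathbb{R}^2$, the motion $t \to x(t)$ described by Eq.
(2.14.1) and following the initial data $x(0) = x_0$, $\dot x(0) = v_0$, verifies

$$\|x\|_{\mathbb{R}_+} \le \gamma\big(|x_0|+|v_0|+\|f\|_{\mathbb{R}_+}\big) \quad\text{if}\quad |x_0|+|v_0|+\|f\|_{\mathbb{R}_+} < \gamma'.$$ (2.14.3)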


Observations. 


(1) Equation (2.14.1) is just one example of a nonlinear equation, chosen
among others for its historical and romantic importance. The results and
methods that follow apply to much more general equations. The reader will
recognize that, in the proof, the key point is that $k\sin\xi - k\xi$ is infinitesimal,
as $\xi \to 0$, of higher order in $\xi$. As an exercise, the reader can, with the obvious
modifications, repeat the proof that follows to investigate the validity of the
statement identical to Proposition 20 for the equation $m\ddot x + \lambda\dot x + k\psi(x) = f(t)$, $t \ge 0$, under the sole assumptions that $\psi\in C^\infty(\mathbb{R})$, $\psi(0) = 0$, $\psi'(0) = (d\psi/d\xi)(0) > 0$.

(2) To realize the necessity, in general, of the restriction on $\|f\|_{\mathbb{R}_+}$ consider
the equation

$$m\ddot x + \lambda\dot x + \sin x = \lambda\omega + \sin\omega t,$$ (2.14.4)

whose solution, among others, $t \to \omega t$ is unbounded. However restrictions on
$x_0, v_0$ are not necessary. In other words, in Proposition 20, one could replace
$|x_0|+|v_0|+\|f\|_{\mathbb{R}_+} < \gamma'$ with $\|f\|_{\mathbb{R}_+} < \gamma'$. We have imposed them only for
the purpose of simplifying the proof.

(3) The idea behind the proof is to “compare” the solution of Eq. (2.14.1)
with the solution of a similar equation where $\sin x$ is replaced by its first-order
approximation, namely x. Such comparison will not be “direct”, but it
will take place by rewriting $k\sin x$ as $kx + k(\sin x - x)$ and considering the
function $t \to k(\sin x(t) - x(t))$ as a known function bounded by $k|x(t)|^3/6$,
because of the inequality $0 \le \xi - \sin\xi \le \xi^3/6$, $\forall\xi\in\mathbb{R}_+$.

In this way, one gets a linear equation with forcing term $f(t) - k(\sin x(t) - x(t))$. Solving it “explicitly” (see the following proof), one finds a t-independent
relation between the amplitude $M(t) = \max_{0\le\tau\le t}|x(\tau)| = \|x\|_{[0,t]}$ and its
cube which, as we shall see, implies that M(t) must stay bounded, $\forall t\in\mathbb{R}_+$.

This method of proof is a particular case of a general method to obtain a
priori estimates on solutions of nonlinear equations close to linear ones and,
sometimes, it is called the “self-consistency” method. The reader should meditate
on the reason for this name after reading the following proof. The self-consistency
method will be again used in this book, for instance in the proof
of the Lyapunov stability criterion (see §5.4).


PROOF. Assume, for simplicity, $\lambda^2 \ne 4mk$. Before analyzing Eq. (2.14.1), it
is useful to remark that the equation

$$m\ddot x + \lambda\dot x + kx = F(t), \qquad t\in\mathbb{R}_+$$ (2.14.5)

with $F\in C^\infty(\mathbb{R})$, admits, among its solutions defined for $t \ge 0$, the solution

$$p_0(t) = \int_0^t\frac{e^{\alpha_+(t-\tau)}-e^{\alpha_-(t-\tau)}}{\alpha_+-\alpha_-}\,F(\tau)\,\frac{d\tau}{m}$$ (2.14.6)

where $\alpha_+$ and $\alpha_-$ are the two roots of $m\alpha^2+\lambda\alpha+k = 0$, i.e.,

$$\alpha_\pm = -\frac{\lambda}{2m}\Big(1\mp\sqrt{1-\frac{4mk}{\lambda^2}}\Big).$$ (2.14.7)

This property can be checked directly by inserting Eq. (2.14.6) into Eq.
(2.14.5), and it is a special case of a general property of the linear differential
equations which will be illustrated further through exercises and problems at
the end of this section.⁷

As already remarked in §2.12, the most general solution to Eq. (2.14.5)
will have the form

$$x(t) = \bar x(t) + p_0(t),$$ (2.14.8)

where $t \to \bar x(t)$, $t\in\mathbb{R}_+$, solves Eq. (2.14.5) with F = 0. Note also that Eq.
(2.14.6) implies that $p_0$ is real valued, even when $\alpha_\pm$ are complex, provided
F is real; also,

$$p_0(0) = 0, \qquad \dot p_0(0) = 0$$ (2.14.9)

Coming back to Eq. (2.14.1) with initial conditions $x(0) = x_0$, $\dot x(0) = v_0$, we
rewrite it as

$$m\ddot x(t) + \lambda\dot x(t) + kx(t) = f(t) + k\big(x(t)-\sin x(t)\big),$$ (2.14.10)

and by the preceding remarks, pretending that the right-hand side is a “known
function” of t, it is

$$x(t) = \bar x(t) + \int_0^t\frac{e^{\alpha_+(t-\tau)}-e^{\alpha_-(t-\tau)}}{\alpha_+-\alpha_-}\big[f(\tau)+k\big(x(\tau)-\sin x(\tau)\big)\big]\frac{d\tau}{m}$$ (2.14.11)

where $t \to \bar x(t)$ is a solution to Eq. (2.14.5) with F = 0 and verifying [see Eq.
(2.14.9)]

$$\bar x(0) = x_0, \qquad \dot{\bar x}(0) = v_0.$$ (2.14.12)
7 Note also that if F is periodic, Eq. (2.14.6) will not be so, in general. Hence, this method
for obtaining particular solutions to Eq. (2.14.5) is different from the one in §2.12, valid
for periodic F’s and based on the Fourier series.

From §2.12, it follows that

$$\bar x(t) = \frac{v_0-\alpha_-x_0}{\alpha_+-\alpha_-}\,e^{\alpha_+t} - \frac{v_0-\alpha_+x_0}{\alpha_+-\alpha_-}\,e^{\alpha_-t},$$ (2.14.13)

and since $\mathrm{Re}\,\alpha_- \le \mathrm{Re}\,\alpha_+ < 0$, it is $|e^{\alpha_\pm t}| \le e^{\mathrm{Re}\,\alpha_+t} \le 1$, $\forall t \ge 0$: hence,

$$\|\bar x\|_{\mathbb{R}_+} \le \frac{|\alpha_+|+|\alpha_-|}{|\alpha_+-\alpha_-|}\,|x_0| + \frac{2}{|\alpha_+-\alpha_-|}\,|v_0|.$$ (2.14.14)

Setting $M(t) = \|x\|_{[0,t]}$, we deduce from Eq. (2.14.11), using the inequality

$$0 \le \xi-\sin\xi \le \frac{\xi^3}{6}, \qquad \forall\xi\in\mathbb{R}_+,$$ (2.14.15)

that, $\forall t \ge 0$:

$$|x(t)| \le \|\bar x\|_{\mathbb{R}_+} + \Big(\|f\|_{\mathbb{R}_+}+\frac{k}{6}M(t)^3\Big)\int_0^t\frac{2\,e^{\mathrm{Re}\,\alpha_+(t-\tau)}}{|\alpha_+-\alpha_-|}\,\frac{d\tau}{m}$$ (2.14.16)

Hence, by integration,

$$|x(t)| \le \|\bar x\|_{\mathbb{R}_+} + \Big(\|f\|_{\mathbb{R}_+}+\frac{k}{6}M(t)^3\Big)\frac{2}{m\,|\mathrm{Re}\,\alpha_+|\,|\alpha_+-\alpha_-|}$$ (2.14.17)

which implies, by Eq. (2.14.14),

$$|x(t)| \le A + BM(t)^3, \qquad t \ge 0$$ (2.14.18)

with

$$A = \frac{|\alpha_+|+|\alpha_-|}{|\alpha_+-\alpha_-|}\,|x_0| + \frac{2}{|\alpha_+-\alpha_-|}\,|v_0| + \frac{2}{m\,|\mathrm{Re}\,\alpha_+|\,|\alpha_+-\alpha_-|}\,\|f\|_{\mathbb{R}_+},$$ (2.14.19)

$$B = \frac{2k}{6m\,|\mathrm{Re}\,\alpha_+|\,|\alpha_+-\alpha_-|}.$$ (2.14.20)

It is then immediately seen from Eq. (2.14.18) that the continuity and monotonicity
of $M(t) = \|x\|_{[0,t]}$ and the arbitrariness of $t \ge 0$ imply

$$M(t) \le A + BM(t)^3, \qquad \forall t\in\mathbb{R}_+,$$ (2.14.21)

and from Eq. (2.14.19), it also follows that

$$M(0) = |x(0)| = |x_0| \le A$$ (2.14.22)

To complete the proof remark that the graph of the function $M \to A + BM^3 - M$
has the form illustrated in Fig. 2.5 if $27BA^2 < 4$. Hence, if $|x_0|, |v_0|, \|f\|_{\mathbb{R}_+}$
are small enough so that the latter inequality involving A and B holds [see
Eqs. (2.14.19) and (2.14.20)], the equation $A + BM^3 - M = 0$ has three real
roots $\mu_1(A), \mu_2(A), \mu_3(A)$, with
$\mu_1(A) < 0$, $0 < \mu_2(A) < (3B)^{-1/2} < \mu_3(A)$, see Fig. 2.5.

Fig. 2.5: Illustration of the bound following from Eqs. (2.14.21), (2.14.22) (graph of $A + BM^3 - M$).

Furthermore, $A + BM^3 - M > A - M$ for all M > 0: hence, $\mu_2(A) > A$. Also, if $0 < M < (3B)^{-1/2}$, it follows that $A + BM^3 - M < A + B\frac{1}{3B}M - M = A - \frac23M$, i.e.,
$\mu_2(A) < \frac32A$. So, concluding:

$$A < \mu_2(A) < \frac32A$$ (2.14.23)

Since the function $t \to M(t)$, $t\in\mathbb{R}_+$, is continuous and verifies Eqs.
(2.14.21) and (2.14.22) and $M(t) \ge 0$, it must be

$$M(t) \le \mu_2(A) < \frac32A$$ (2.14.24)

which concludes the proof. The constant $\gamma'$ is determined by the condition
$27BA^2 < 4$ and $\gamma$ by Eq. (2.14.24) recalling Eq. (2.14.19). ∎
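The last step of the proof, the location of the middle root $\mu_2(A)$ of $A + BM^3 - M = 0$, can be illustrated numerically. The sketch below is not from the text: the bisection routine `middle_root` and the sample values of A and B are assumptions chosen only so that $27BA^2 < 4$ holds.

```python
import math

def middle_root(A, B, iterations=200):
    """Root mu_2 of A + B M^3 - M = 0 lying in (0, (3B)^(-1/2)), by bisection.

    Requires A, B > 0 and 27 B A^2 < 4, the condition used in the proof.
    """
    if not (A > 0 and B > 0 and 27.0 * B * A * A < 4.0):
        raise ValueError("need A, B > 0 and 27 B A^2 < 4")
    g = lambda M: A + B * M ** 3 - M
    lo, hi = 0.0, 1.0 / math.sqrt(3.0 * B)   # g(lo) = A > 0, g(hi) < 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical values of A and B satisfying 27 B A^2 < 4.
cases = [(0.1, 1.0), (0.3, 1.5), (0.01, 50.0)]
roots = [middle_root(A, B) for A, B in cases]
for (A, B), mu in zip(cases, roots):
    print(A, mu, 1.5 * A)   # expect A < mu < 1.5 A, Eq. (2.14.23)
```

Bisection suffices here because $A + BM^3 - M$ is decreasing on $(0, (3B)^{-1/2})$, so the root in that interval is unique.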


2.14.1 Exercises and Problems 


1. Consider the differential equation $\dot x = \alpha x + f(t)$ and show that $p(t) = \int_0^t e^{\alpha(t-\tau)}f(\tau)\,d\tau$
is a solution to it with initial datum p(0) = 0, $\forall f\in C^\infty(\mathbb{R})$, $\alpha\in\mathbb{R}$.

2. Let L be a d × d matrix with constant coefficients and consider the differential equation
$\dot x = Lx + f(t)$, where $f\in C^\infty(\mathbb{R})$ is an $\mathbb{R}^d$-valued function. Assume that L has d distinct
eigenvalues $\lambda_1,\ldots,\lambda_d$ with respective eigenvectors $v^{(1)},\ldots,v^{(d)}$. Show that if for $w\in\mathbb{R}^d$,
we denote $\alpha_1(w),\ldots,\alpha_d(w)$ the components of w on the basis $v^{(1)},\ldots,v^{(d)}$ (see
Appendix E), then

$$p(t) = \int_0^t\sum_{j=1}^{d}e^{\lambda_j(t-\tau)}\,\alpha_j(f(\tau))\,v^{(j)}\,d\tau$$

is a particular solution to the equation, with p(0) = 0. (Hint: Note that $\sum_{j=1}^{d}\alpha_j(f(t))\,v^{(j)} = f(t)$ and check the validity of the equation by substitution.)


3.* In the context of Problem 2, Let xO), CRAG x bed linearly independent solutions of the 
equation x = Lx with initial data z (0) = ôij, i,j =1,...,d. Let Wij (t) = al) (t), 2,9 = 
1,...,d: show that it is the matrix already introduced in Problems 7-9, §2.2 (“wronskian 


matrix”), verifying dW/dt = LW. (Hint: Use the differential equation verified by each row
of W.)

4.* In the context of Problems 2 and 3, show that

$$t \to p(t) = \int_0^t W(t-\tau)\,f(\tau)\,d\tau$$

is a special solution to $\dot x = Lx + f(t)$ with initial datum x(0) = 0; i.e., it coincides with the
one in Problem 2.


5. Apply the method of Problem 2 to find particular solutions to the equation « = —x + 
y+ fit), y =- — y + f2(t). 


6. Same as Problem 4 for $m\ddot x + \lambda\dot x + kx = f(t)$, after reducing it to a first-order system of
equations. Consider the case f(t) = t. Show that such solution verifies $x(0) = 0$, $\dot x(0) = 0$.


7. Same as Problem 4 for dtz/dtt — —d?x/dt? + x = t, after reducing it to a first-order 
system. Show that such solution verifies z(0) = 0, (0) = 0, #(0) = 0, z” (0) = 0. 


2.15 Damped Pendulum: Small Forced Oscillations 


We shall now show that the pendulum, as the damped linear oscillator, also
admits periodic motions isochronous with the forcing term, at least if the
oscillations are small. This solves problem (2) posed in §2.14. Again,
the pendulum is selected only for definiteness. The theory developed below
is valid for equations obtained from Eq. (2.14.1) by changing $\sin x$ into $\psi(x)$,
where $\psi$ is an arbitrary $C^\infty$ function such that $\psi(0) = 0$, $\psi'(0) > 0$.

Consider the normal equation

$$m\ddot x + \lambda\dot x + k\sin x = \gamma f(t), \qquad t\in\mathbb{R}_+,$$ (2.15.1)

$\gamma\in\mathbb{R}$, $\lambda, m, k > 0$, $\lambda^2 \ne 4mk$ (for sake of simplicity), then

21 Proposition. Let $t \to f(t)$, $t\in\mathbb{R}$, be a $C^\infty$ periodic function with period
T > 0. There exists a periodic motion with period T verifying Eq. (2.15.1),
provided $\gamma$ is small enough.


Observation. The proof below is based on a very general method used to treat
such questions and relying on the implicit functions theorems. Together with
Eq. (2.15.1), one considers the “linearized equation”

$$m\ddot x + \lambda\dot x + kx = f,$$ (2.15.2)

which, as shown in §2.12, admits a periodic solution isochronous with f:

$$t \to \bar x(t) = \sum_{n=-\infty}^{+\infty}\frac{\hat f_n\,e^{i\frac{2\pi}{T}nt}}{-m(\frac{2\pi}{T})^2n^2 + i\frac{2\pi}{T}\lambda n + k}$$ (2.15.3)

where $(\hat f_n)_{n\in\mathbb{Z}}$ are the harmonics of f. Then look for a periodic solution to
Eq. (2.15.1) having the form

$$t \to x(t) = \gamma\bar x(t) + y(t), \qquad t\in\mathbb{R}$$ (2.15.4)

with initial data:

$$y(0) = \varepsilon, \qquad \dot y(0) = \eta$$ (2.15.5)

hoping to be able to show that $\varepsilon, \eta$, and y exist and are infinitesimal as $\gamma \to 0$,
of higher order in $\gamma$ (i.e., hoping that $\gamma\bar x(t)$ is a very good approximation to
x(t) for small $\gamma$). The function $t \to x(t)$, solution of Eqs. (2.15.1) and (2.15.5),
depends on $\varepsilon, \eta, \gamma$ [in a $C^\infty$ way, by the regularity theorem (see Proposition
3 and Problem 17 of §2.5)]; set

$$x(T) = \gamma\bar x(T) + a(\varepsilon,\eta,\gamma), \qquad \dot x(T) = \gamma\dot{\bar x}(T) + b(\varepsilon,\eta,\gamma);$$ (2.15.6)

i.e., $y(T) = a(\varepsilon,\eta,\gamma)$, $\dot y(T) = b(\varepsilon,\eta,\gamma)$ [see Eq. (2.15.4)]. Therefore, the condition
that Eq. (2.15.1) admits a periodic solution with period T can be written
(see Proposition 12) as

$$a(\varepsilon,\eta,\gamma) = \varepsilon, \qquad b(\varepsilon,\eta,\gamma) = \eta,$$ (2.15.7)

since $\bar x(0) = \bar x(T)$, $\dot{\bar x}(0) = \dot{\bar x}(T)$ by the periodicity of $\bar x$.

So the problem of proving Proposition 21 is equivalent to proving the solubility
of the implicit functions problem of expressing, from Eq. (2.15.7), $\varepsilon$ and $\eta$ as
functions of $\gamma$ for $\gamma$ small.


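PROOF. Note that the functions $f_1(\varepsilon,\eta,\gamma) = a(\varepsilon,\eta,\gamma) - \varepsilon$ and $f_2(\varepsilon,\eta,\gamma) = b(\varepsilon,\eta,\gamma) - \eta$ are $C^\infty$ functions. To study Eq. (2.15.7), write the equation
verified by $t \to y(t)$ defined in Eq. (2.15.4):

$$m\ddot y(t) + \lambda\dot y(t) + ky(t) = k\big(\gamma\bar x(t) + y(t) - \sin(\gamma\bar x(t) + y(t))\big), \qquad y(0) = \varepsilon, \quad \dot y(0) = \eta$$ (2.15.8)

This equation and the uniqueness theorem for differential equations show that
if $\varepsilon = 0, \eta = 0, \gamma = 0$, it follows that $y(t) \equiv 0$, $t\in\mathbb{R}_+$ and, therefore,

$$f_1(0,0,0) = 0, \qquad f_2(0,0,0) = 0$$ (2.15.9)

It is then natural to look for solutions of Eq. (2.15.7) near $\gamma = 0$ through the
implicit functions theorem (see Appendix G). The solubility condition of Eq.
(2.15.7) for small $\gamma$ is that the Jacobian matrix

$$J = \begin{pmatrix}\dfrac{\partial f_1}{\partial\varepsilon}(0,0,0) & \dfrac{\partial f_1}{\partial\eta}(0,0,0)\\[2mm] \dfrac{\partial f_2}{\partial\varepsilon}(0,0,0) & \dfrac{\partial f_2}{\partial\eta}(0,0,0)\end{pmatrix}$$ (2.15.10)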

has non vanishing determinant. To compute the derivatives in Eq. (2.15.10), 
recall that 
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a(e,, y) = y(T), b(e, 7, 7) = y(T), (2.15.11) 


where t — y(t) solves Eq. (2.15.8). Pretending that the right-hand side of Eq. 
(2.15.8) is a known function of t € R4, write 


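$$y(t) = \tilde y(t) + \int_0^t \frac{e^{\alpha_+(t-\tau)} - e^{\alpha_-(t-\tau)}}{\alpha_+ - \alpha_-}\; k\big(\gamma\bar x(\tau) + y(\tau) - \sin(\gamma\bar x(\tau) + y(\tau))\big)\,\frac{d\tau}{m}, \tag{2.15.12}$$

along the same lines as the proof in the preceding section, where $t \to \tilde y(t)$ is
a solution to

$$m\ddot{\tilde y} + \lambda\dot{\tilde y} + k\tilde y = 0, \qquad \tilde y(0) = \varepsilon, \quad \dot{\tilde y}(0) = \eta, \tag{2.15.13}$$

[see Eq. (2.14.13)]:

$$\tilde y(t) = \frac{\eta - \alpha_-\varepsilon}{\alpha_+ - \alpha_-}\,e^{\alpha_+ t} - \frac{\eta - \alpha_+\varepsilon}{\alpha_+ - \alpha_-}\,e^{\alpha_- t}. \tag{2.15.14}$$

Hence,

$$a(\varepsilon,\eta,\gamma) = \eta\,\frac{e^{\alpha_+T} - e^{\alpha_-T}}{\alpha_+ - \alpha_-} + \varepsilon\,\frac{\alpha_+e^{\alpha_-T} - \alpha_-e^{\alpha_+T}}{\alpha_+ - \alpha_-} + \int_0^T \frac{e^{\alpha_+(T-\tau)} - e^{\alpha_-(T-\tau)}}{\alpha_+ - \alpha_-}\; k\big(\gamma\bar x(\tau) + y(\tau) - \sin(\gamma\bar x(\tau) + y(\tau))\big)\,\frac{d\tau}{m}, \tag{2.15.15}$$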


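and a similar expression can be found for $b$ by differentiating Eq. (2.15.12)
with respect to $t$ and setting $t = T$. From Eq. (2.15.15), we can compute
the partial derivatives of $a$ with respect to $\varepsilon, \eta, \gamma$ in $(0,0,0)$, without really
knowing $y(t)$ (remarkably enough). For instance:

$$\frac{\partial a}{\partial\varepsilon}(0,0,0) = \frac{\alpha_+e^{\alpha_-T} - \alpha_-e^{\alpha_+T}}{\alpha_+ - \alpha_-} + \int_0^T \frac{e^{\alpha_+(T-\tau)} - e^{\alpha_-(T-\tau)}}{\alpha_+ - \alpha_-}\; k\,(1 - \cos y(\tau))\,\frac{\partial y}{\partial\varepsilon}(\tau)\,\frac{d\tau}{m} = \frac{\alpha_+e^{\alpha_-T} - \alpha_-e^{\alpha_+T}}{\alpha_+ - \alpha_-}, \tag{2.15.16}$$

where $\tau \to y(\tau)$, $\tau \ge 0$, is the solution to Eq. (2.15.8) with $\varepsilon = \eta = \gamma = 0$.
Note that $\frac{\partial y}{\partial\varepsilon}(\tau)$ is unknown but is multiplied by zero and, therefore, it is not
necessary to know it. Similarly:

$$\frac{\partial a}{\partial\eta}(0,0,0) = \frac{e^{\alpha_+T} - e^{\alpha_-T}}{\alpha_+ - \alpha_-}, \qquad \frac{\partial b}{\partial\varepsilon}(0,0,0) = \alpha_+\alpha_-\,\frac{e^{\alpha_-T} - e^{\alpha_+T}}{\alpha_+ - \alpha_-}, \qquad \frac{\partial b}{\partial\eta}(0,0,0) = \frac{\alpha_+e^{\alpha_+T} - \alpha_-e^{\alpha_-T}}{\alpha_+ - \alpha_-}, \tag{2.15.17}$$

hence, it is possible to write the matrix $J$ and, with some patience, the algebraic
calculations lead to

$$\det J = (e^{\alpha_+T} - 1)(e^{\alpha_-T} - 1) \ne 0. \tag{2.15.18}$$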


This completes the proof since the implicit functions theorem (Appendix G)
implies that Eq. (2.15.7) can be uniquely solved for small $\gamma$ with $\varepsilon(\gamma), \eta(\gamma)$ of
the order $O(\gamma)$.

Actually, the implicit functions theorem implies that the derivatives of
$\varepsilon(\gamma), \eta(\gamma)$ with respect to $\gamma$ at $\gamma = 0$ are proportional to the derivatives of
$f_1$ and $f_2$ with respect to $\gamma$ in $\varepsilon = \eta = \gamma = 0$. Since such derivatives can be
computed in the same way as those in Eqs. (2.15.16) and (2.15.17) and they
turn out to be zero, it also follows that $\varepsilon(\gamma), \eta(\gamma)$ are of the order $O(\gamma^2)$ as
expected.
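The determinant formula of Eq. (2.15.18) can be checked numerically. The sketch below is only an illustration: it builds the four partial derivatives of $a$ and $b$ at $(0,0,0)$ from the free solution of $m\ddot y + \lambda\dot y + ky = 0$ with data $(\varepsilon, \eta)$, forms $J$ by subtracting the identity (because $f_1 = a - \varepsilon$, $f_2 = b - \eta$), and compares $\det J$ with $(e^{\alpha_+T} - 1)(e^{\alpha_-T} - 1)$. The function name `jacobian_det` and the numerical values of $m, \lambda, k, T$ are assumptions chosen for the example, and $\lambda^2 \ne 4mk$ is assumed so that $\alpha_+ \ne \alpha_-$.

```python
import cmath

def jacobian_det(m, lam, k, T):
    """Return (det J, closed form) for the linearized period map at (0, 0, 0).

    alpha_+/- are the roots of m*a**2 + lam*a + k = 0 (assumed distinct).
    """
    disc = cmath.sqrt(lam * lam - 4.0 * m * k)
    ap = (-lam + disc) / (2.0 * m)
    am = (-lam - disc) / (2.0 * m)
    ep, em = cmath.exp(ap * T), cmath.exp(am * T)
    d = ap - am

    # Partial derivatives of a = y(T) and b = y'(T) with respect to (eps, eta).
    da_deps = (ap * em - am * ep) / d
    da_deta = (ep - em) / d
    db_deps = ap * am * (em - ep) / d
    db_deta = (ap * ep - am * em) / d

    # J is the derivative of (a - eps, b - eta): subtract the identity matrix.
    det_j = (da_deps - 1.0) * (db_deta - 1.0) - da_deta * db_deps
    closed_form = (ep - 1.0) * (em - 1.0)
    return det_j, closed_form

if __name__ == "__main__":
    det_j, closed_form = jacobian_det(m=1.0, lam=0.3, k=2.0, T=1.7)
    print(abs(det_j - closed_form))
```

The agreement reflects the fact that the linearized map has determinant $e^{(\alpha_+ + \alpha_-)T}$ and trace $e^{\alpha_+T} + e^{\alpha_-T}$.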


2.15.1 Problems 


1. Show that the oscillator $\ddot x + \dot x + x + x^3 = f(t)$, $f \in C^\infty$, has small oscillations in the sense
of Proposition 20. Show that if $f$ has the form $f(t) = \gamma\varphi(t)$, $\gamma \in R$, and $\varphi$ periodic with
period $T > 0$, then for $\gamma$ small enough the equation admits a periodic solution with period
$T$. (Hint: Go through the proof of Proposition 21, replacing $\sin x$ by $x + x^3$ everywhere.)


2. Show that the motion $\ddot x + \dot x + x^3 = 0$, $x(0) = 1$, $\dot x(0) = 0$ never goes through the
origin as $t \to +\infty$. How does this result depend on the datum? (Hint: From $\ddot x + \dot x + x = x(1 - x^2)$
write $x(t)$ as in Eq. (2.15.12), which will imply $x(t) > 0$. To study other
initial data $x_0$ use $\dot x = -\sqrt{2(E(x) - x^4/4)}$, see Problem 1, §2.9, and, supposing that the
first passage time is $t_0 = +\infty$, deduce that this implies
$E(0) = \int_0^{x_0}\sqrt{2(E(x) - x^4/4)}\,dx < \int_0^{x_0}\sqrt{2(E(0) - x^4/4)}\,dx < \sqrt{2E(0)}\,x_0$,
i.e. $x_0 < 2\sqrt2$, and infer that if $x_0 > 2\sqrt2$ the point passes through the origin.)


3. Same as Problem 2 for $\ddot x + 3\dot x + x + x^3 = 0$, $x(0) = 5$, $\dot x(0) = 0$. (Hint: Write the equation
as $\ddot x + 3\dot x + 2x = x(1 - x^2)$ and follow the hint to Problem 2.)


4.* Consider the oscillator $\ddot x + \dot x + x + x^3 = 0$ and find the limit $T_\infty$, as $t \to +\infty$, of the
time $T(t)$ elapsing between the two consecutive passages through the origin with positive
speed taking place after $t$. Show that it does not depend on the initial datum (Answer:
$T_\infty = 4\pi/\sqrt3$). (Hint: Let $x_1, x_2, \ldots$ be the successive maxima of the motion and let $t_1, t_2, \ldots$
be the corresponding times; call $\bar t_1, \bar t_2, \ldots$ the first passage times through the origin following
$t_1, t_2, \ldots$, respectively; then use Eq. (2.15.12) in the intervals $[t_1, \bar t_1], [t_2, \bar t_2], \ldots$ and the fact
that $x_i \to 0$.)


5.* Same as Problem 4 for $\ddot x + \dot x + \tanh x = 0$; discover why $T_\infty$ is the same as that in
Problem 4.


6. Examine critically the proof of Proposition 21 to see under which assumptions its
conclusions remain valid when $\lambda = 0$. (Answer: If and only if $\det J \ne 0$, i.e., if and only if the
forcing period $T$ is not an integer multiple of the “proper period” $T_0 = 2\pi\sqrt{m/k}$.)


2.16 Small Damping: Resonances


We shall not study the problem (3), p.65, in detail since, in the next few 
sections, we shall conclude our analysis of the damped motions (in one di- 
mension) by an application where a similar, but more difficult, problem is 
analyzed. Let us simply formulate a result about problem (3), without proof: 


22 Proposition. Let $f \in C^\infty(R)$ be a periodic function with period $T > 0$
and consider the forced pendulum of Eq. (2.15.1). If $\gamma$ is small enough, the
equation admits one periodic solution $t \to x_p(t)$, $t \in R_+$, with period $T$,
and every other solution $t \to x(t)$, $t \in R_+$, with initial datum $(x_0, v_0)$ with
$|x_0| + |v_0|$ small enough approaches, exponentially fast, the periodic solution:
i.e., there are $C > 0$, $\mu > 0$ such that

$$|x(t) - x_p(t)| \le C\,e^{-\mu t}, \qquad \forall t \in R_+. \tag{2.16.1}$$


Observations. 


(1) The proof of this proposition is very similar to that of Proposition 25 on 
the theory of the clock. The reader will reconstruct it from that proof. 

(2) Hence, the small oscillations of nonlinear damped oscillators are qualita- 
tively very similar to those of damped linear oscillators, at least if one is only 
concerned with properties (1), (2), and (3) selected for discussion at §2.14. 


So far, the presence of friction has revealed itself to be essential to the
theory (see, however, Problem 6, §2.15). In fact something “goes wrong” as
$\lambda \to 0$. This can be seen for the linear oscillators, as it will be briefly discussed
in the following. This time, however, consideration of only harmonic oscillators
will not just be “for simplicity”, but because only in this case will it be possible
to obtain something without excessive conceptual and technical difficulties.

In the nonlinear case, the discussion is, surprisingly at first sight, much 
more involved (and interesting) and, also, the results are unfortunately less 
detailed and complete than desirable for applications. Some basic ideas and 
technical tools will be developed in §5.9-§5.12 of Chapter 5. 

Actually, contrary to what is sometimes believed, the motion of mechanical 
systems is much simpler and stable when friction is present than when it 
is absent. When friction vanishes, the motion becomes very sensitive to the 
details of the equations of motion and to the initial data, as far as asymptotic 
behavior is concerned, in this way introducing new difficulties and peculiarly 
new phenomena. Also, from the mathematical point of view, the frictionless 
motion theory appears to be deep and rich with connections to the most 
diverse fundamental problems in analysis and geometry: from number theory 
to topology to probability theory. 


8 However, at a deeper level of understanding, similar statements could also be made for
dissipative systems: a glimpse of how complex they may become is given in §5.8. 
Our discussion of the small friction case will be based on the following two
linear (normal) equations:

$$m\ddot x + \lambda\dot x + kx = f \tag{2.16.2}$$

with $\lambda > 0$, $k > 0$, $m > 0$ and

$$m\ddot x + kx = f, \tag{2.16.3}$$

where $f$ is a $C^\infty$ function periodic with period $T > 0$. The discussion will be
restricted to the following simple proposition (for the time being).


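23 Proposition. Given $x_0, v_0 \in R$, let $t \to x_\lambda(t)$, $t \in R_+$, be the solution to
Eq. (2.16.2) with initial data $x_\lambda(0) = x_0$, $\dot x_\lambda(0) = v_0$. Let $t \to x_0(t)$, $t \in R_+$,
be the solution to Eq. (2.16.3) with data $x_0(0) = x_0$, $\dot x_0(0) = v_0$. The following
results hold:

(i)
$$\lim_{\lambda\to0} x_\lambda(t) = x_0(t), \qquad \forall t \in R_+. \tag{2.16.4}$$

(ii) The preceding limit is “uniform as $\lambda t \to 0$”: i.e., given $\varepsilon > 0$, there exist
$\delta_\varepsilon > 0$, $\lambda_\varepsilon > 0$ such that

$$|x_\lambda(t) - x_0(t)| < \varepsilon, \qquad \forall\lambda < \lambda_\varepsilon, \quad \forall t \le \delta_\varepsilon\lambda^{-1}. \tag{2.16.5}$$

(iii) If $T$ is not an integer multiple of the “proper period” $T_0 = 2\pi\sqrt{m/k}$ of
the undamped free harmonic oscillator, one has

$$x_0(t) = A_0\cos\Big(\frac{2\pi}{T_0}t + \varphi_0\Big) + \sum_{n=-\infty}^{+\infty}\frac{\hat f_n\,e^{\frac{2\pi i}{T}nt}}{-m(\frac{2\pi}{T})^2n^2 + k}, \tag{2.16.6}$$

where $A_0, \varphi_0$ are suitable constants and $(\hat f_n)_{n\in Z}$ are the harmonics of $f$ on
the period $T$: this is the “non resonant case”.

(iv) If $T = \bar nT_0$ for some integer $\bar n$:

$$x_0(t) = A_0\cos\Big(\frac{2\pi}{T_0}t + \varphi_0\Big) + \sum_{n\ne\pm\bar n}\frac{\hat f_n\,e^{\frac{2\pi i}{T}nt}}{-m(\frac{2\pi}{T})^2n^2 + k} + 2\,\mathrm{Re}\,\frac{\hat f_{\bar n}\,t\,e^{\frac{2\pi i}{T}\bar nt}}{2i\,\frac{2\pi}{T}\bar n\,m}. \tag{2.16.7}$$

This is the “resonant case”.

Observations.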
Observations. 


(1) (ii) is particularly significant and says that the smaller the friction, the
longer the time during which the friction-driven motion coincides, within a
given approximation $\varepsilon$, with the frictionless motion (this time being at least
$\delta_\varepsilon/\lambda$). Hence, (ii) strengthens and implies (i).
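Statement (ii) can be illustrated on the simplest possible case. The sketch below assumes, purely for illustration, $f = 0$, $m = k = 1$, $x_0 = 1$, $v_0 = 0$, for which both motions are known in closed form, and measures the largest gap between the damped and the frictionless motion over the window $0 \le t \le \delta/\lambda$. The helper names and the values of $\delta$, $\lambda$ and the sampling step are assumptions, not taken from the text.

```python
import math

def x_damped(t, lam):
    """Solution of x'' + lam*x' + x = 0 with x(0) = 1, x'(0) = 0 (assumes lam < 2)."""
    w = math.sqrt(1.0 - lam * lam / 4.0)
    return math.exp(-lam * t / 2.0) * (math.cos(w * t) + lam / (2.0 * w) * math.sin(w * t))

def x_free(t):
    """Solution of x'' + x = 0 with the same initial data."""
    return math.cos(t)

def max_gap(lam, t_max, dt=0.01):
    """Largest sampled |x_damped - x_free| on [0, t_max]."""
    steps = int(t_max / dt)
    return max(abs(x_damped(i * dt, lam) - x_free(i * dt)) for i in range(steps + 1))

if __name__ == "__main__":
    delta = 0.1
    for lam in (0.05, 0.01, 0.002):
        # The window grows like 1/lam while the gap stays bounded.
        print(lam, delta / lam, max_gap(lam, delta / lam))
    # On a window with lam*t of order 1 the two motions separate visibly.
    print(max_gap(0.05, 50.0))
```

In this example the gap over $[0, \delta/\lambda]$ remains of the order of $\delta$ for every small $\lambda$, while on a fixed long window with $\lambda t$ of order one the damped motion has visibly decayed away from the frictionless one.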
(2) The above proposition also illustrates the “resonance phenomenon”. By
what has been seen in §2.12, the solution to Eq. (2.16.2) of interest to us is

$$x_\lambda(t) = A_+e^{\alpha_+t} + A_-e^{\alpha_-t} + \sum_{n=-\infty}^{+\infty}\frac{\hat f_n\,e^{\frac{2\pi i}{T}nt}}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k} \tag{2.16.8}$$

where $v_0 = \dot x_\lambda(0)$, $x_0 = x_\lambda(0)$ will determine the constants $A_+, A_-$ and

$$\alpha_\pm = -\frac{\lambda}{2m}\Big(1 \pm i\sqrt{\frac{4mk}{\lambda^2} - 1}\Big). \tag{2.16.9}$$

From Eq. (2.16.8), it immediately follows that as $t \to +\infty$, the asymptotic
motion is $T$ periodic and it is given by

$$\tilde x_\lambda(t) = \sum_{n=-\infty}^{+\infty}\frac{\hat f_n\,e^{\frac{2\pi i}{T}nt}}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k} \tag{2.16.10}$$

provided the first two “transient terms” in Eq. (2.16.8) are very small: i.e.,
provided $\lambda t/2m \gg 1$.
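If there is $\bar n$ such that $T = \bar nT_0$, select the two terms in Eq. (2.16.10) with
$n = \pm\bar n$ and rewrite them as

$$\tilde x_\lambda(t) = 2\,\mathrm{Re}\,\frac{\hat f_{\bar n}\,e^{\frac{2\pi i}{T}\bar nt}}{i\,\frac{2\pi}{T}\bar n\,\lambda} + \sum_{|n|\ne\bar n}\frac{\hat f_n\,e^{\frac{2\pi i}{T}nt}}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k}. \tag{2.16.11}$$

Setting $\hat f_{\bar n} = \rho_{\bar n}e^{i\theta_{\bar n}}$, $\rho_{\bar n} > 0$, $\theta_{\bar n} \in R$, the first term becomes

$$\frac{\rho_{\bar n}\,T}{\pi\,\bar n\,\lambda}\,\sin\Big(\frac{2\pi}{T}\bar nt + \theta_{\bar n}\Big) \tag{2.16.12}$$

while the series in Eq. (2.16.11) can be bounded above uniformly in $\lambda$ by

$$\sum_{|n|\ne\bar n}\frac{|\hat f_n|}{\big|-m(\frac{2\pi}{T})^2n^2 + k\big|}. \tag{2.16.13}$$

So if $T = \bar nT_0$, for some integer $\bar n$, and if the force $f$ is arbitrarily small
but such that $\hat f_{\bar n} \ne 0$, the motion impressed by $f$ to the oscillator may attain
an enormous amplitude, as Eqs. (2.16.11)-(2.16.13) show, for small $\lambda$.

If $T/T_0$ is not integer but almost such ($T/T_0 \simeq \bar n \in Z$), it will happen
that the series of Eq. (2.16.10) will contain terms (those with $n = \pm\bar n$) with
denominators which, even though not vanishing as $\lambda \to 0$, will become very
small, producing two contributions to Eq. (2.16.10) that could “dominate” the
others.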
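The size of each harmonic of the asymptotic motion is controlled by the modulus of the denominator in Eq. (2.16.10). The following sketch evaluates the factor $1/|-m\omega_n^2 + i\omega_n\lambda + k|$ with $\omega_n = 2\pi n/T$, which multiplies $|\hat f_n|$; the function name `harmonic_gain` and the numerical values ($m = k = 1$, so that $T_0 = 2\pi$) are assumptions made for the example.

```python
import math

def harmonic_gain(n, T, m, lam, k):
    """Modulus of 1 / (-m*w**2 + i*w*lam + k) with w = 2*pi*n/T.

    Multiplying by |f_n| gives the size of the n-th term of the asymptotic motion.
    """
    w = 2.0 * math.pi * n / T
    return 1.0 / abs(complex(k - m * w * w, w * lam))

if __name__ == "__main__":
    m, k = 1.0, 1.0
    T = 2.0 * math.pi  # forcing period equal to the proper period (resonance, n = 1)
    for lam in (0.1, 0.01, 0.001):
        resonant = harmonic_gain(1, T, m, lam, k)      # grows like 1/lam
        non_resonant = harmonic_gain(2, T, m, lam, k)  # stays bounded as lam -> 0
        print(lam, resonant, non_resonant)
```

With these values the resonant factor equals $1/\lambda$ (here $\omega_1 = 1$), while the factor of the second harmonic never exceeds $1/3$, whatever $\lambda$ is.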

(3) It should be stressed that resonance manifests itself only when the terms
$A_\pm e^{\alpha_\pm t}$ in Eq. (2.16.8) are small and, therefore, only if $\lambda t/2m \gg 1$. Hence,
although it is true that in the resonating linear oscillator ($T = \bar nT_0$, $\bar n \in Z_+$)
a very small force can produce huge oscillations (proportional to $\lambda^{-1}$), it is
also true that the time it takes for this to happen is very large (proportional
to $\lambda^{-1}$). Note also that

$$A_\pm = \pm\frac{(v_0 - \dot{\tilde x}_\lambda(0)) - \alpha_\mp\,(x_0 - \tilde x_\lambda(0))}{\alpha_+ - \alpha_-} \tag{2.16.14}$$

becomes singular as $\lambda \to 0$, in resonance cases, because such are $\tilde x_\lambda(0)$, $\dot{\tilde x}_\lambda(0)$
[see Eqs. (2.16.10) and (2.16.12)].

(4) Equations (2.16.6) and (2.16.7) give the most general solutions to Eq.
(2.16.3) as $A_0, \varphi_0$ vary arbitrarily. They show that when $\lambda = 0$, the linear
oscillator motions are no longer periodic but, rather, are “sums” of two periodic
motions with respective periods $T_0$ and $T$, equal to the “proper” period of the
free oscillator and to the period of the forcing force, provided that $T/T_0$ is not
an integer. If $T/T_0$ is integer, and if the harmonic component of order $T/T_0$
of the force $f$ does not vanish, the asymptotic motion is even unbounded:
“undamped resonance”.

Furthermore, in every case, the asymptotic motion depends on the initial
datum (through $A_0, \varphi_0$). It is clear that the initial datum dependence surviving
in the asymptotic regime is due to the absence of friction: analytically this
appears via the fact that $e^{\alpha_\pm t} \not\to 0$, since $\mathrm{Re}\,\alpha_\pm = 0$ if $\lambda = 0$.
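The unbounded growth in the undamped resonant case can be seen on a one-harmonic example. Assume, for illustration only, the forcing $f(t) = F\cos(\omega_0t)$ with $\omega_0 = \sqrt{k/m}$ and initial data $x(0) = \dot x(0) = 0$; then the resonant term of Eq. (2.16.7) reduces to $x(t) = F\,t\,\sin(\omega_0t)/(2m\omega_0)$. The sketch integrates $m\ddot x + kx = f$ with a standard fourth-order Runge-Kutta scheme and compares with this closed form; the function name, step size and parameter values are assumptions.

```python
import math

def rk4_resonant(F=1.0, m=1.0, k=1.0, t_end=20.0, dt=1e-3):
    """Integrate m*x'' + k*x = F*cos(w0*t), x(0) = x'(0) = 0, with w0 = sqrt(k/m).

    Returns the final position and the largest deviation from
    F*t*sin(w0*t)/(2*m*w0) observed along the way.
    """
    w0 = math.sqrt(k / m)

    def acc(t, x):
        return (F * math.cos(w0 * t) - k * x) / m

    t, x, v = 0.0, 0.0, 0.0
    worst = 0.0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, acc(t, x)
        k2x, k2v = v + 0.5 * dt * k1v, acc(t + 0.5 * dt, x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, acc(t + 0.5 * dt, x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, acc(t + dt, x + dt * k3x)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        exact = F * t * math.sin(w0 * t) / (2.0 * m * w0)
        worst = max(worst, abs(x - exact))
    return x, worst

if __name__ == "__main__":
    x_end, worst = rk4_resonant()
    print(x_end, worst)
```

The envelope of the motion grows linearly in time, in contrast with the damped case, where the amplitude saturates at a value proportional to $\lambda^{-1}$.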

(5) The proof of Proposition 23 is a simple discussion of the limit as $\lambda \to 0$
of the expressions (2.16.8) and (2.16.14). No problem arises in the absence of
resonance. In the resonant case, the limit is most conveniently discussed by
collecting together the first two terms in Eq. (2.16.8) and the two resonant
terms in the series (2.16.8) (i.e., those with $n = \pm T/T_0$). The calculations are
straightforward and are left to the reader.⁹


2.16.1 Exercises and Problems 


1. Determine up to 20%, the asymptotic amplitude of the oscillations of the motions of 
+ rA¥e +a = f(t), f(t) = (1 —cos2at/T)—! for T = 1,47, v2. Which, in each case, 
are the resonant harmonics? (Call “resonance” a harmonic of order n € Z if the function 
¿E> GEE — (Ft)? takes its minimum between n and n + 1.) 


2. Determine the asymptotic amplitude of the motion described by +a = f(t) with f 
given as in Problem 1. 


3. Estimate how small A has to be taken so that the amplitude of the asymptotic oscillations 
described by #+ At +x = f(t), with f(t) = 10-3(1 — 107? cost)—!, is not smaller than 
A =1,10,107, 10°. 


4. Same as Problem 3 with f(t) = 1078 (1 — 0.99 cos t)™ t. 


9 Note that (i) would also directly follow from the regularity theorem (Proposition 3, p.
22, and Problem 17, p. 32).
5. Write a computer program for the empirical solution (i.e., without error estimates) of
the equation in Problem 1 with the purpose of drawing graphs, in the data space, of the
trajectories corresponding to the various choices of the initial datum (using the computer
screen and always avoiding tabulation of results).


2.17 An Application: Construction of a Rigorously 
Periodic Oscillator in the Presence of Friction. The 
Anchor Escapement, Feedback Phenomena 


Che l’una parte l’altra tira e urge
Tin tin sonando con sì dolce nota
Che ’l ben disposto spirto d’amor turge ¹⁰


In §2.12 we saw that a damped harmonic oscillator can move exactly pe- 
riodically with a period equal to that of the forcing term. Furthermore, any 
of its motions differs from this periodic one by an amount which becomes 
exponentially small as $t \to +\infty$. It is natural to try to use this property to
build a clock, i.e., a mechanism moving in a rigorously periodic fashion despite 
friction. However, the difficulty of producing a rigorously periodic force seems 
to be, at least, of the same order of magnitude as that of producing a periodic 
motion. 

The anchor escapement is a contrivance in a timepiece which controls the 
motion of the train of wheel work and through which the energy of the weight 
is delivered to the pendulum by means of impulses which keep the latter in 
vibration (see: Webster). 

This mechanism simultaneously solves the two problems of building a rig- 
orously periodic force and of inducing a rigorously periodic motion. It takes 
advantage of the presence of friction to cause the oscillator to move asymptotically
in a periodic way in the sense that the difference between the actual
oscillator’s position $x(t)$, at time $t$, and the position of a certain ideal periodic
motion $x_{per}(t)$ tends exponentially to zero as $t \to +\infty$. A very schematic
empirical description of the anchor escapement is the following.

The “anchor” is a device set in motion by the oscillator as it passes through 
the point $x_0 = 0$, for instance, with positive velocity. At this instant, a notched
wheel connected to a weight is liberated from a brake and starts moving. A 
little later, the notch of the wheel reaches the oscillator and accompanies 
it for a short while, exerting a push on it. Then the notched wheel loses 
contact with the oscillator, which remains free, allowing the wheel to return 


10 In basic English:
That every part pulls another 
tin tin singing, so sweetly: 
that the well inclined spirit is filled with love. 
(Dante, Paradiso, Canto X). 
to its original position by continuing its rotation. In this simplified scheme, the
wheel has just one notch instead of the usual few dozens. In the meantime, the
oscillator, now free, continues its (damped) oscillation, and the entire process
starts afresh at the new passage through $x_0$ with positive speed.

An attempt to schematize the just-described mechanical system is a motion
governed by the equation:

$$m\ddot x + \lambda\dot x + kx = 0, \qquad x < 0, \tag{2.17.1}$$
$$m\ddot x + \lambda\dot x + kx = f(\dot x_0, \tau), \qquad x > 0, \tag{2.17.2}$$

where the action of the notched wheel is schematized by a force $f(\dot x_0, \tau)$
depending upon the velocity $\dot x_0$ of the last passage through $x_0 = 0$ with
positive speed and upon the time $\tau$ elapsed since then.

Note that Eqs. (2.17.1) and (2.17.2) are equations of motion quite different 
from the ones considered in the preceding sections. The force appearing in Eq. 
(2.17.2) not only depends on time but also upon the past history of the motion 
itself. Therefore it is not a differential equation in the sense of §2.2, Definition 
1. Consequently, we do not even know yet whether Eqs. (2.17.1) and (2.17.2)
have a solution, i.e., a $C^\infty$ function $t \to x(t)$, $t \in R_+$, which turns Eqs.
(2.17.1) and (2.17.2) into an identity (not even if $f$ is a $C^\infty$ function of its
arguments).

To study Eqs. (2.17.1) and (2.17.2), it is useful to place some restrictions on 
the form of f which we intend to consider: i.e., it is useful to further specialize 
the model. This is done to avoid problems too complex from a technical point 
of view, as well as to avoid developing a theory for too general an f, which 
may not correspond to a force law that is reasonable for our problem. 

For the sake of example, let us assume that $f(\dot x_0, \tau)$ vanishes whenever $\dot x_0$
does not belong to an interval $[v_-, v_+]$, $0 < v_- < v_+$:

$$\dot x_0 \notin [v_-, v_+] \;\Rightarrow\; f(\dot x_0, \tau) = 0. \tag{2.17.3}$$


The assumption corresponds to the fact that when the oscillator sweeps 
through $x_0 = 0$ too fast, it is never reached by the wheel’s notch; while if
it sweeps too slowly, the amplitude of oscillation is too small to allow the 
oscillator to touch the notch. 

Assume, also, that once $\dot x_0 \in [v_-, v_+]$, the force on the oscillator only
depends on the time $\tau$ elapsed since the last passage through $x_0 = 0$ with
positive speed; i.e.,

$$f(\dot x_0, \tau) = P\,\chi(\dot x_0)\,g(\tau) \tag{2.17.4}$$

where $\chi(\dot x_0) = 1$ if $\dot x_0 \in [v_-, v_+]$, and $\chi(\dot x_0) = 0$ otherwise, and $\tau \to g(\tau) \ge 0$
is a $C^\infty(R)$ function vanishing outside an interval $[a, T_g]$, $a > 0$, $T_g > 0$, with
a maximum equal to 1. The constant $P$, which we shall take as a positive
adjustable parameter, models the “intensity” of the force. Physically, one can
imagine that $P$ depends on the weight moving the notched wheel, while $g$ is
a detailed description of the wheel action.

Therefore, Eq. (2.17.4) will be considered a mathematical model of the 
force generated by the anchor escapement. Such a model is only a schema- 
tization, where some of the properties of any real mechanism are certainly 
oversimplified. Nevertheless, as it will be shown, it is a model presenting some 
interesting characteristics such as, primarily, the “self-control” or “feedback” 
mechanism providing that the system (2.17.1), (2.17.2), and (2.17.4) “searches 
automatically”, in certain circumstances, for a situation of motion that allows 
it to move periodically. The function g has a graph like that in Fig.2.6. 

Some further properties which we must impose on $g$ should be that $g$
vanishes for $\tau < a$, for some $a > 0$, or for $\tau > T_g > a > 0$, and $T_g$ should be
small compared to the time necessary for an elongation of the oscillator from
the position $x_0 = 0$ to the position of maximum distance from $x_0 = 0$.

The time $a > 0$ is a mechanical constant representing the delay between
the beginning of the wheel motion and the actual oscillator-notch contact.
The $\dot x_0$ independence of $a$ is a strong idealization.


[Figure 2.6: graph of $g(\tau)$ against $\tau$, with the points $a$ and $T_g$ marked on the $\tau$ axis.]

Fig.2.6: The force “per unit weight” due to the notch “engagement” of the oscillator as a
function of time elapsed since the last sweep through the origin.


The physically obvious requirement on $T_g$ can be translated into mathematical
terms by requiring $T_g < T_0/4$, where $T_0 = 2\pi\sqrt{m/k}$ is the ideal
oscillator period. This attempts to translate the fact that the notch has to
detach itself from the oscillator before the latter starts swinging back toward
the origin. Empirically, the condition $T_g < T_0/4$ should guarantee this fact,
at least if the friction is small so that it produces negligible effects for times
of the order of $T_0$; i.e., as we saw in §2.12, if $\lambda^2 < 4mk$.

As a conclusion to the above considerations, assume as a model for the
anchor escapement Eqs. (2.17.1), (2.17.2), and (2.17.4) with $g$ as in Fig.2.6
with $0 < a < T_g < T_0/4$ and with $\lambda^2 < 4mk$, and let us prove the following
proposition, which begins the theory of the model.


24 Proposition. Under suitable compatibility conditions between the parameters
$P, v_-, v_+$, Eqs. (2.17.1), (2.17.2), and (2.17.4) admit a periodic $C^\infty$
solution defined for $t \in R_+$.


Observation. In the upcoming section we shall discuss the compatibility
conditions by showing that they can be satisfied at least when $\lambda$ is small enough.
Later, in §2.19 we shall also show that when $\lambda$ is small enough and the
compatibility conditions are fulfilled, the motions with initial data close enough
to those of the periodic motion become close to such motion exponentially
fast (see Figs.2.7 and 2.8).


PROOF. Let $t \to x(t)$ be a given periodic motion with period $T > T_g$ and such
that $x(0) = 0$. Then the function defined, for $t \ge 0$, by

$$\varphi(t) = f(v_0, \tau) = P\,\chi(v_0)\,g(\tau) \tag{2.17.5}$$

where $v_0$ is the velocity of the given motion at its last passage through the
origin, with positive speed and before time $t$, and $\tau$ is the time elapsed since
such time, is a $C^\infty$-periodic function of $t$, provided the time necessary to
return to 0 with positive speed is equal to $T$ itself.¹¹

Assuming, then, that $t \to x(t)$, $t \in R_+$, is a $C^\infty$-periodic motion verifying
Eqs. (2.17.1), (2.17.2), and (2.17.4) and period $T$ equal to its first return time
to the origin with positive speed and assuming that $v_0 = \dot x(0) \in [v_-, v_+]$ we
shall have, $\forall t \ge 0$,

$$m\ddot x(t) + \lambda\dot x(t) + k\,x(t) = \varphi(t). \tag{2.17.6}$$

Since $\varphi(t) = P\,g(t)$, $\forall t \in [0, T]$, and if

$$P\,\hat g_n = \frac1T\int_0^T\varphi(\tau)\,e^{-\frac{2\pi i}{T}n\tau}\,d\tau = \frac PT\int_0^{T_g}g(\tau)\,e^{-\frac{2\pi i}{T}n\tau}\,d\tau, \tag{2.17.7}$$

recalling that $T_g < T$, it follows (see §2.12)

$$x(t) = P\sum_{n=-\infty}^{+\infty}\frac{\hat g_n\,e^{\frac{2\pi i}{T}nt}}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k} \tag{2.17.8}$$

Figure 2.7. Graph of a periodic solution $t \to (x(t), \dot x(t))$ of Eqs. (2.17.1), (2.17.2), and
(2.17.4) with convenient choices of the arbitrary parameters and of the function $g$.


11 It could a priori happen that the motion sweeps through the origin more than twice
(even infinitely many times) in an interval of time equal to the period $T$.

and the series is uniformly convergent because $\hat g_n$ approaches zero as $n \to \infty$
faster than any power (being the Fourier transform of the $C^\infty$-periodic
function $\varphi$). Of course, we still have to determine $T$ and to check that for such
$T$, Eq. (2.17.8) is really a solution to Eqs. (2.17.1), (2.17.2), and (2.17.4). In
other words, we must impose the condition that Eq. (2.17.8) is such that


(i) $x(0) = 0$; (2.17.9)
(ii) $\dot x(0) \in [v_-, v_+]$; (2.17.10)
(iii) $T > T_g$; (2.17.11)
(iv) $T$ is the first return time in 0 with positive velocity. (2.17.12)


Relation (2.17.9) is an equation for the period $T$:

$$0 = \sum_{n=-\infty}^{+\infty}\frac{\hat g_n}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k} \tag{2.17.13}$$

and it should be noted that in this equation, $T$ also appears in the coefficients
$\hat g_n$ [see Eq. (2.17.7)].

Figure 2.8. Graph of a solution $t \to (x(t), \dot x(t))$ with initial datum chosen arbitrarily: it
becomes indistinguishable from the periodic solution of Fig.2.7 within a few oscillations
(three in the precision of the drawing).

Then if the parameters $v_-, v_+, P, T_g$ are such that Eq. (2.17.13) admits at
least one solution $T$ and if with this choice of $T$ Eq. (2.17.8) verifies the
compatibility conditions Eq. (2.17.10)-(2.17.12), it follows that Eq. (2.17.8) is
a $T$-periodic solution to Eqs. (2.17.1), (2.17.2), and (2.17.4).
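The model of Eqs. (2.17.1), (2.17.2), and (2.17.4) is simple to explore numerically. The following sketch is a heuristic simulation (no error estimates), and every concrete choice in it is an assumption made for illustration: the bump function `g`, the values $m = k = 1$, $\lambda = 0.1$, $P = 1$, $[v_-, v_+] = [0.5, 3]$, $a = 0.1$, $T_g = 1 < T_0/4$, the fixed-step fourth-order Runge-Kutta scheme, and the linear interpolation used to locate each passage through the origin. It records the velocities at successive passages through $x = 0$ with positive speed.

```python
import math

def g(tau, a=0.1, t_g=1.0):
    """Smooth bump: zero outside [a, t_g], maximum 1 at the midpoint (illustrative choice)."""
    if tau <= a or tau >= t_g:
        return 0.0
    s = (2.0 * tau - a - t_g) / (t_g - a)  # maps [a, t_g] onto [-1, 1]
    return math.exp(1.0 - 1.0 / (1.0 - s * s))

def simulate(v_start, n_cycles=40, m=1.0, lam=0.1, k=1.0,
             P=1.0, v_minus=0.5, v_plus=3.0, dt=5e-3):
    """Return the velocities at successive upward passages through x = 0."""
    t, x, v = 0.0, 0.0, v_start
    t_last = 0.0  # time of the last upward passage through the origin
    chi = 1.0 if v_minus <= v_start <= v_plus else 0.0
    crossings = [v_start]

    def acc(time, pos, vel):
        # The push acts only for x > 0 and depends on the time since the last passage.
        push = P * chi * g(time - t_last) if pos > 0.0 else 0.0
        return (push - lam * vel - k * pos) / m

    while len(crossings) <= n_cycles:
        k1x, k1v = v, acc(t, x, v)
        k2x, k2v = v + 0.5 * dt * k1v, acc(t + 0.5 * dt, x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = v + 0.5 * dt * k2v, acc(t + 0.5 * dt, x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = v + dt * k3v, acc(t + dt, x + dt * k3x, v + dt * k3v)
        x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v_new = v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        if x < 0.0 <= x_new and v_new > 0.0:
            # Upward passage: locate it by linear interpolation inside the step.
            frac = -x / (x_new - x)
            t_last = t + frac * dt
            v0 = v + frac * (v_new - v)
            chi = 1.0 if v_minus <= v0 <= v_plus else 0.0
            crossings.append(v0)
        t, x, v = t + dt, x_new, v_new
    return crossings

if __name__ == "__main__":
    slow = simulate(0.8)
    fast = simulate(2.5)
    print(slow[-1], fast[-1])
```

With these illustrative values, runs started with different velocities inside $[v_-, v_+]$ settle on the same passage velocity after a few dozen oscillations, which is the behavior sketched in Figs. 2.7 and 2.8.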


2.17.1 Exercises 


1. Choose arbitrarily a function $g$ and $m, k, \lambda, v_-, v_+ > 0$ and write a computer program
providing a heuristic (i.e., without error estimate) solution to Eqs. (2.17.1), (2.17.2), and
(2.17.4) in which $P$ and the datum $\dot x(0)$ are left as free parameters. The output of the
program should be a graph like those in Figs. 2.7 and 2.8.


2. Run the above program on a desk computer plotting on the screen the results and finding,
by trial and error, a value of $P$ yielding a nontrivial periodic motion.
2.18 Compatibility Conditions for the Anchor
Escapement


This section, as well as the next, will suppose some maturity on the reader’s 
part and, therefore, on first reading it would be appropriate to skip the proof 
of this section and to read the next section only up to the beginning of the 
proof of Proposition 26. 

As promised in the previous section, it will now be shown that if $\lambda$ is
small enough, given $v_-, v_+, g$ with $T_g < T_0/4$, it is possible to fix $P$ so that
Eq. (2.17.8) actually verifies the four compatibility conditions Eq. (2.17.9)-
(2.17.12) and, therefore, is a periodic solution to Eqs. (2.17.1), (2.17.2), and
(2.17.4), i.e., to the equation for the anchor-escapement model.

Consider Eq. (2.17.13) as an equation for $T^{-1}$ parameterized by $\lambda > 0$,
and let us find some of its solutions having the form

$$T^{-1} = T_0^{-1}(1 + \lambda\beta) \tag{2.18.1}$$


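suggested by the idea that for small $\lambda$ the oscillator may oscillate with a
periodic motion with period close to the period $T_0 = 2\pi\sqrt{m/k}$ of the
frictionless oscillator. The equation for $T$, Eq. (2.17.13), then becomes, after
explicitly separating out of the series sum the two complex conjugate terms
with $n = \pm1$:

$$0 = 2\,\mathrm{Re}\,\frac{\hat g_1}{-m(\frac{2\pi}{T_0})^2(1 + \lambda\beta)^2 + \frac{2\pi i}{T_0}\lambda(1 + \lambda\beta) + k} + \sum_{n\ne\pm1}\frac{\hat g_n}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k} \tag{2.18.2}$$

which, using $T_0 = 2\pi\sqrt{m/k}$, becomes

$$0 = 2\,\mathrm{Re}\,\frac{\hat g_1}{-k(2\beta + \lambda\beta^2)\lambda + \frac{2\pi i}{T_0}\lambda(1 + \lambda\beta)} + \sum_{n\ne\pm1}\frac{\hat g_n}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k}. \tag{2.18.3}$$

We see that for small $\lambda$ the first of the above two addends shows a small
denominator. To avoid having to study an equation with small denominators,
multiply Eq. (2.18.3) by $\lambda$. Then, $\forall(\lambda, \beta)$, $\lambda\beta > -\frac12$ (so that $T_0^{-1}(1 + \lambda\beta) > 0$),
it is possible to define

$$\Phi(\lambda, \beta) \stackrel{def}{=} 2\,\mathrm{Re}\,\frac{\hat g_1}{-k(2\beta + \lambda\beta^2) + \frac{2\pi i}{T_0}(1 + \lambda\beta)} + \lambda\sum_{n\ne\pm1}\frac{\hat g_n}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k} \tag{2.18.4}$$

where $T^{-1} = T_0^{-1}(1 + \lambda\beta)$ and it is perhaps worth recalling that $\hat g_n$ are also
$T$ dependent. Rewrite Eq. (2.18.3) as

$$\Phi(\lambda, \beta) = 0 \tag{2.18.5}$$

with the additional restriction $\lambda\beta > -\frac12$ [to be amply sure that the
denominators in Eq. (2.18.4) do not vanish]. To study Eq. (2.18.5), note that the
equation $\Phi(0, \beta) = 0$ leads to

$$2\,\mathrm{Re}\,\frac{\hat g_1}{-2k\beta + \frac{2\pi i}{T_0}} = 0 \tag{2.18.6}$$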
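which, defining

$$b(T^{-1}) = \mathrm{Re}\,\hat g_1, \qquad c(T^{-1}) = \mathrm{Im}\,\hat g_1, \tag{2.18.7}$$

has as a solution the quantity $\beta_0$:

$$\beta_0 = \frac{1}{2k}\,\frac{2\pi}{T_0}\,\frac{c(T_0^{-1})}{b(T_0^{-1})} \tag{2.18.8}$$

provided that $b(T_0^{-1}) \ne 0$. Note also that $b(T_0^{-1}) \ne 0$ as it follows from [see
Eq. (2.17.7)]

$$b(T_0^{-1}) = \frac{1}{T_0}\int_0^{T_g}d\tau\,g(\tau)\cos\frac{2\pi}{T_0}\tau, \qquad c(T_0^{-1}) = -\frac{1}{T_0}\int_0^{T_g}d\tau\,g(\tau)\sin\frac{2\pi}{T_0}\tau, \tag{2.18.9}$$

thanks to the assumption $T_g < T_0/4$ which implies that for $\tau \in (0, T_g)$, the
sine and cosine in Eq. (2.18.9) are positive.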
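The above remarks give hope for the existence of a solution to Eq. (2.18.5)
having the form

$$\beta(\lambda) = \beta_0 + O(\lambda), \tag{2.18.10}$$

at least if $\lambda$ is small. If this were true, the velocity $\dot x(0)$ could be computed
from Eq. (2.17.8):

$$\dot x(0) = P\sum_{n=-\infty}^{+\infty}\frac{\frac{2\pi i}{T}n\,\hat g_n}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k} = \frac P\lambda\Big[2\,\mathrm{Re}\,\frac{\frac{2\pi i}{T}\hat g_1}{-k(2\beta + \lambda\beta^2) + \frac{2\pi i}{T_0}(1 + \lambda\beta)} + \lambda\sum_{n\ne\pm1}\frac{\frac{2\pi i}{T}n\,\hat g_n}{-m(\frac{2\pi}{T})^2n^2 + \frac{2\pi i}{T}n\lambda + k}\Big]. \tag{2.18.11}$$

Hence, Eqs. (2.18.11) and (2.18.10) would imply, with some algebra (and
patience),

$$\dot x(0) = \frac P\lambda\Big[2\,\mathrm{Re}\Big\{\frac{2\pi}{T_0}\,\frac{i\,b(T_0^{-1}) - c(T_0^{-1})}{-2k\beta_0 + \frac{2\pi i}{T_0}}\Big\} + O(\lambda)\Big] = \frac{2P}{\lambda}\big(b(T_0^{-1}) + O(\lambda)\big), \tag{2.18.12}$$

having used Eq. (2.18.8).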

Therefore, if $\lambda$ is so small that in Eq. (2.18.12) $|O(\lambda)| < b(T_0^{-1})$, it is
$\dot x(0) \ne 0$, and $P$ can be so chosen that $\dot x(0) \in [v_-, v_+]$. Note that $P \to 0$
proportionally to $\lambda$, as $\lambda \to 0$, if one imposes $\dot x(0) \in [v_-, v_+]$: this agrees with
the obvious empirical observation that the “weight” necessary to move the
oscillator must be small in proportion to friction.
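The leading-order quantities of this section are easy to evaluate for a concrete $g$. The sketch below reads Eq. (2.18.9) as $b = \frac{1}{T_0}\int g\cos$, $c = -\frac{1}{T_0}\int g\sin$, Eq. (2.18.8) as $\beta_0 = \frac{1}{2k}\frac{2\pi}{T_0}\frac{c}{b}$, and the leading term of Eq. (2.18.12) as $\dot x(0) \simeq 2Pb/\lambda$; this reading, the bump function `g`, the midpoint quadrature and all numerical values ($m = k = 1$, $\lambda = 0.1$, $P = 1$, $a = 0.1$, $T_g = 1$) are assumptions made for illustration.

```python
import math

def g(tau, a=0.1, t_g=1.0):
    """Smooth bump: zero outside [a, t_g], maximum 1 at the midpoint (illustrative choice)."""
    if tau <= a or tau >= t_g:
        return 0.0
    s = (2.0 * tau - a - t_g) / (t_g - a)
    return math.exp(1.0 - 1.0 / (1.0 - s * s))

def first_harmonic(T, a=0.1, t_g=1.0, n_nodes=2000):
    """Midpoint-rule estimate of b = Re g_1 and c = Im g_1,
    where g_1 = (1/T) * integral of g(tau) * exp(-2*pi*i*tau/T)."""
    h = (t_g - a) / n_nodes
    b = 0.0
    c = 0.0
    for i in range(n_nodes):
        tau = a + (i + 0.5) * h
        b += g(tau, a, t_g) * math.cos(2.0 * math.pi * tau / T) * h
        c -= g(tau, a, t_g) * math.sin(2.0 * math.pi * tau / T) * h
    return b / T, c / T

m, k, lam, P = 1.0, 1.0, 0.1, 1.0
T0 = 2.0 * math.pi * math.sqrt(m / k)
b, c = first_harmonic(T0)
beta0 = (1.0 / (2.0 * k)) * (2.0 * math.pi / T0) * c / b  # leading coefficient of 1/T
T_est = T0 / (1.0 + lam * beta0)                          # first-order estimate of the period
v0_est = 2.0 * P * b / lam                                # leading-order passage velocity

if __name__ == "__main__":
    print(b, c, beta0, T_est, v0_est)
```

Since $b > 0$ and $c < 0$ here, $\beta_0$ is negative: to first order in $\lambda$ the periodic motion is slightly slower than the frictionless oscillation, and the passage velocity scales like $P/\lambda$.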

Similarly, starting from Eqs. (2.18.7) and (2.18.10), one could check Eq.
(2.17.12) for small $\lambda$. It can in fact be seen that it is enough to verify Eq.
(2.17.12) by replacing $x(t)$ in Eq. (2.17.8) with the only contributions to the
series (2.17.8) coming from the $n = \pm1$ terms (which in the preceding discussion
seem to be the only important ones for small $\lambda$, as far as the computation
of $T$ and of $\dot x(0)$ are concerned). For such an approximation to $x(t)$, the statement
of Eq. (2.17.12) is, however, obvious since such an approximate motion
is a harmonic motion with period $T$. Elaboration of the details is left to the
reader.

Finally, Eq. (2.17.11) would also immediately follow from Eqs. (2.18.1)
and (2.18.10) for small $\lambda$.


The above analysis can be summarized in the following proposition. 


25 Proposition. If Eq. (2.18.5), as an equation for $\beta$ parameterized by $\lambda$,
admits a solution having the form of Eq. (2.18.10) for $\lambda$ small enough, then
the equation for the anchor-escapement model [Eqs. (2.17.1), (2.17.2), and
(2.17.4)] admits a periodic solution with period $T$ such that:

$$T^{-1} = T_0^{-1}\big(1 + \beta_0\lambda + O(\lambda^2)\big) \tag{2.18.13}$$

if $\lambda$ is sufficiently small and if $P$ is suitably chosen.


Therefore, to complete the solution of our question, it only remains to
verify that Eq. (2.18.5) does indeed admit a solution $\beta$ like Eq. (2.18.10) for
small $\lambda$.

A pair $(\lambda, \beta)$ verifying Eq. (2.18.5) is already known, namely the pair
$(0, \beta_0)$; hence it is natural to try to treat Eq. (2.18.5) through the implicit
function theorem (see Appendix G). By this theorem, it will be enough to
check that the function $\Phi$, Eq. (2.18.4), defined in the open set of $R^2$ containing
the points $(\lambda, \beta)$ such that $\lambda\beta > -\frac12$, is of class $C^\infty$ in its domain of definition
and has a first-order derivative with respect to $\beta$ such that

$$\frac{\partial\Phi}{\partial\beta}(0, \beta_0) \ne 0. \tag{2.18.14}$$


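In this case, Eq. (2.18.5) will admit a solution $\beta$ for $\lambda$ small enough, like

$$\beta = \beta_0 - \frac{(\partial\Phi/\partial\lambda)(0, \beta_0)}{(\partial\Phi/\partial\beta)(0, \beta_0)}\,\lambda + o(\lambda). \tag{2.18.15}$$

To see that $\Phi$ is a $C^\infty$ function near $(0, \beta_0)$, one shows that from expression
in Eq. (2.17.7) and from estimates in Eq. (2.13.7) it follows that, for $n \ne 0$ and
every integer $k \ge 1$ (here $k$ is an index, not the elastic constant),

$$\hat g_n = \frac1T\,\frac{1}{((2\pi i/T)n)^k}\int_0^{T_g}\frac{d^kg}{d\tau^k}(\tau)\,e^{-\frac{2\pi i}{T}n\tau}\,d\tau \tag{2.18.16}$$

and, by Newton’s formula for the $p$-th derivative of a product:

$$\frac{\partial^p\hat g_n}{\partial(T^{-1})^p} = \frac{1}{(2\pi in)^k}\int_0^{T_g}d\tau\,\frac{d^kg}{d\tau^k}(\tau)\,e^{-2\pi in\tau T^{-1}}\sum_{j=0}^p\binom pj(-2\pi ni\tau)^{p-j}(-k+1)(-k)\cdots(-k-j+2)\,(T^{-1})^{-k+1-j}, \tag{2.18.17}$$

hence

$$\Big|\frac{\partial^p\hat g_n}{\partial(T^{-1})^p}\Big| \le \frac{(k+p)!}{(2\pi|n|)^k}\sum_{j=0}^p\binom pj(2\pi|n|T_g)^{p-j}\,T^{k+j-1}\int_0^{T_g}\Big|\frac{d^kg}{d\tau^k}(\tau)\Big|\,d\tau \tag{2.18.18}$$

which implies that as long as $T < +\infty$ (i.e., $1 + \beta\lambda > 0$), the function in Eq.
(2.18.4) is a $C^\infty$ function of $\beta$ and $\lambda$.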

The last three relations also imply that the derivatives of $\Phi$ with respect
to $\lambda$ and $\beta$ can be computed by term-by-term differentiation of the series
defining $\Phi$ in the region $\lambda\beta > -\frac12$. After a brief computation, such a term-by-term
differentiation evaluated at $(0, \beta_0)$ yields:

$$\frac{\partial\Phi}{\partial\beta}(0, \beta_0) = -4k\Big(\frac{T_0}{2\pi}\Big)^2\frac{b(T_0^{-1})^3}{b(T_0^{-1})^2 + c(T_0^{-1})^2} \ne 0 \tag{2.18.19}$$

and this check of Eq. (2.18.14) concludes the discussion of the compatibility
conditions showing that they can indeed be satisfied for small enough $\lambda$.


2.19 Encore on Anchor Escapement: Stability of the
Periodic Motion


In the preceding sections we showed that the anchor-escapement model [Eqs. 
(2.17.1), (2.17.2), and (2.17.4)] admits a periodic solution for small enough 
friction if the intensity of weight P is suitably chosen. 

Imagining to have fixed $P$ conveniently in terms of $\lambda$, such a periodic
motion will be denoted $t \to \bar x(t)$, $t \in R_+$. However, existence of the motion $\bar x$
is not interesting in itself for applications. In fact, to put the system in this
state of motion, one would have to impress exactly the velocity $v_0$ at $t = 0$,
with $v_0$ defined by Eq. (2.18.11) after putting the oscillator in $x_0 = 0$. In fact,
these are the initial data of the periodic motion corresponding to the a priori
given $\lambda$ and $P$.¹²

The periodic motion studied in the preceding sections is interesting for
applications only if it is “stable”, i.e., only if starting the system in an initial
state $x(0) = 0$, $\dot x(0) = v_0 + \eta$, perturbed with respect to that which would
generate a periodic motion, would produce a motion $t \to x_\eta(t)$, $t \in R_+$,
according to Eqs. (2.17.1), (2.17.2), and (2.17.4), which exists and is unique,
at least for small $\eta$, and, furthermore,

$$\lim_{t\to+\infty}|x_\eta(t) - \bar x(t - \tau_\eta)| = 0 \tag{2.19.1}$$

if $\tau_\eta$ is suitably chosen.

In applications, one would like to require more: for instance, one would 
wish that the limit (2.19.1) is attained with an exponential speed with a 
halving time of the order of the period T of the periodic motion. In such a 
case, after a “few” oscillations, the motion would be identical to the rigorously 
periodic one, for all practical purposes. This is what actually occurs in the 
pendulum clock. 


12 One should also show that to such an initial datum an actually periodic motion does 
follow: i.e., one should prove a uniqueness theorem, at least for the initial data under 
examination. This is possible, as well as it is also possible to show a uniqueness property 
on the perturbed motions that will be met in this section. However, we shall not enter 
into the proof of the validity of the uniqueness properties that interest us: the reader 
should do this as a problem. Note that Proposition 1, p. 14, does not directly apply here, 
since the equations do not have the form contemplated in §2.2.

To examine the stability problem, the following proposition will be proved.


26 Proposition. The periodic motion of the anchor-escapement model, $t \to \bar x(t)$, $t \in \mathbf{R}_+$, built in §2.17 and §2.18, is stable in the sense expressed in Eq. (2.19.1) if $\lambda$ is small enough. The limit (2.19.1) is reached exponentially with a halving time $T_{1/2}$ of the order of magnitude

$$T_{1/2} = \max(T, 2m\lambda^{-1}) \tag{2.19.2}$$


Observations. 

(1) During the proof, the role of friction and its importance will clearly ap- 
pear. It is a rather general rule that the dissipative motions are more stable 
than the corresponding frictionless motions, as long as the friction is not too 
strong. The price paid for this stability, of obvious and essential importance 
in applications, is naturally the necessity of the action of a force to maintain 
the motion itself. 

(2) One could require, and prove, stability with respect to initial data that are more general and realistic than those considered in Proposition 26. For instance, with respect to initial data like $x(0) = \varepsilon$, $\dot x(0) = v_0 + \eta$, the theory and results would be essentially the same.

(3) Proposition 26 concludes our theory of the anchor escapement. One should clearly bear in mind that the mathematical equations (2.17.1), (2.17.2), and (2.17.4) are just a model, in some respects not very satisfactory. For instance, the independence of the force $f(\dot x_0, \tau)$, once $\dot x_0 \in [v_-, v_+]$, is unrealistic.

(4) However, the model considered performs perfectly one of the typical func- 
tions of models and clarifies the possibility of the existence of an important 
mechanism which would also have to be present in more refined models: the 
possibility of a motion controlling itself via a feedback reaction inducing it to 
move periodically after a short while. This self-control, understood and practi- 
cally realized at a time when the field of mechanics was new, is a phenomenon 
which appears in many models concerning the most diverse physical systems. 
The design and construction of the most precise machines are based on it, as 
well as the very possibility of their existence. 


PROOF. Define [see Eq. (2.19.1) and the preceding lines for notation]:

$$x_\eta(t) = \bar x(t) + \xi(t) \tag{2.19.3}$$

and let us show the existence of a $C^\infty$ solution of Eqs. (2.17.1), (2.17.2), and (2.17.4) verifying the initial conditions $x_\eta(0) = 0$, $\dot x_\eta(0) = \dot{\bar x}(0) + \eta$, provided $\eta$ is small enough and the values of $P, \lambda$ are such that the periodic motion $t \to \bar x(t)$, $t \ge 0$, exists. Call $T$ the period of $\bar x$. First, note that if $t \to \xi_1(t)$ is the solution of the equation

$$m\ddot\xi_1 + \lambda\dot\xi_1 + k\xi_1 = 0, \qquad \xi_1(0) = 0, \quad \dot\xi_1(0) = \eta, \qquad t \in \mathbf{R}_+ \tag{2.19.4}$$

and if $T_1$ is the first positive time when the motion $t \to \bar x(t) + \xi_1(t)$ passes through the origin with positive speed, then the motion solves Eqs. (2.17.1), (2.17.2), and (2.17.4) for $t \in [0, T_1]$ if $\eta$ is small. To understand this property, note that the solution of Eq. (2.19.4) is

$$\xi_1(t) = \eta\, e^{-\frac{\lambda}{2m} t}\, \frac{\sin\left(\sqrt{k/m - \lambda^2/4m^2}\; t\right)}{\sqrt{k/m - \lambda^2/4m^2}} \tag{2.19.5}$$


hence, $\forall t \ge 0$:

$$|\xi_1(t)| \le \frac{|\eta|}{\sqrt{k/m - \lambda^2/4m^2}} \tag{2.19.6}$$

Then, if $|\eta|$ is small enough, to fix the ideas $|\eta| < \delta_\lambda$, with $\delta_\lambda$ suitably chosen, it is clear that Eq. (2.19.3) with $\xi_1(t)$ replacing $\xi(t)$ verifies Eqs. (2.17.1), (2.17.2), and (2.17.4).
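
The closed form (2.19.5) and the bound on $|\xi_1|$ can be checked numerically. The following Python sketch is only an illustration: the values of $m$, $k$, $\lambda$, $\eta$ are arbitrary choices not taken from the text, and the names `xi1` and `residual` are hypothetical helpers. It evaluates $\xi_1$, measures how well it satisfies Eq. (2.19.4) by finite differences, and compares its largest sampled value with $|\eta|/\sqrt{k/m - \lambda^2/4m^2}$.

```python
import math

def xi1(t, eta, m=1.0, k=1.0, lam=0.1):
    """Closed form (2.19.5): damped oscillator with xi(0) = 0, xi'(0) = eta."""
    w = math.sqrt(k / m - lam**2 / (4 * m**2))  # requires lam**2 < 4*m*k
    return eta * math.exp(-lam * t / (2 * m)) * math.sin(w * t) / w

def residual(t, eta, m=1.0, k=1.0, lam=0.1, h=1e-4):
    """Finite-difference value of m*xi'' + lam*xi' + k*xi at time t (should be ~0)."""
    f = lambda s: xi1(s, eta, m, k, lam)
    d1 = (f(t + h) - f(t - h)) / (2 * h)           # central first derivative
    d2 = (f(t + h) - 2 * f(t) + f(t - h)) / h**2   # central second derivative
    return m * d2 + lam * d1 + k * f(t)

eta = 0.05                                          # illustrative perturbation size
bound = eta / math.sqrt(1.0 - 0.1**2 / 4)           # |eta| / sqrt(k/m - lam^2/4m^2)
worst = max(abs(xi1(0.01 * i, eta)) for i in range(5000))
print(residual(1.3, eta), worst, bound)
```

The sampled maximum `worst` stays below `bound` because the exponential factor and the sine are both bounded by 1 in modulus.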

To estimate a choice for $\delta_\lambda$, the following conditions must be imposed:
(1) T, <Ti <T+a,V|n| < 6; 

(2) the velocity at the first passage through the origin is negative and at the 
second passage is positive. 

Such conditions are true for the reference motion $\bar x$ if $\lambda$ is small enough since, in such a case, as already mentioned and used in §2.18, the reference motion is almost a harmonic motion of period $\simeq T_0$ for which the conditions under analysis manifestly hold. Therefore, by continuity, they must remain true for the motion $t \to \bar x(t) + \xi_1(t)$ if $\eta$ is small. We leave the elaboration of the details to the reader.

The fact that $t \to \bar x(t) + \xi_1(t)$ is a solution for $t \in [0, T_1]$ will not, in general, remain true for $t > T_1$, because in Eq. (2.17.4) the time $\tau$ is now counted beginning at $T_1$, and $T \ne T_1$, in general.

To study the motions for times following $T_1$, define

$$\eta_1 \overset{\mathrm{def}}{=} \dot{\bar x}(T_1) + \dot\xi_1(T_1) - \dot{\bar x}(0); \tag{2.19.7}$$

then if $|\eta_1| < \delta_\lambda$, we can define, as we already saw, the function $t \to x_\eta(t) - \bar x(t - T_1) = \xi_2(t)$, where $\xi_2(t)$ is defined for $t$ between $T_1$ and the first instant $T_2$ successive to $T_1$ when the motion sweeps through $0$ with positive speed, as the solution of Eq. (2.19.4) with initial datum:

$$\xi_2(T_1) = 0, \qquad \dot\xi_2(T_1) = \eta_1. \tag{2.19.8}$$


Repeating indefinitely the argument, it is possible to define $\eta_2, \eta_3, \ldots$ provided $|\eta_i| < \delta_\lambda$, $i = 1, 2, \ldots$, thus obtaining the definition of the times $T_1, T_2, T_3, \ldots$ corresponding to the successive passages through $0$ with positive speed.

The stability property asserted in the proposition will have been proven once the existence has been shown of two constants $c_\lambda > 0$, $0 < \theta_\lambda < 1$ such that for all $\eta$, $|\eta| < \delta_\lambda$:

$$|T_p - (T + T_{p-1})| \le c_\lambda\, \theta_\lambda^{\,p}, \qquad p = 1, 2, \ldots \tag{2.19.9}$$

and

$$|\eta_p| \le \theta_\lambda^{\,p}\, |\eta| \tag{2.19.10}$$

at least for small $\lambda$. Setting $T_0 = 0$ in the sequence of passage times, the constant $\tau_\eta$ [see Eq. (2.19.1)] will then be

$$\tau_\eta = \sum_{i=1}^{\infty} \bigl(T_i - (T + T_{i-1})\bigr). \tag{2.19.11}$$
Let us then show the validity of Eqs. (2.19.9) and (2.19.10). If $T_0 = 2\pi\sqrt{m/k}$ (from here on $T_0$ again denotes the proper period of the oscillator, as in §2.18), the value for $\theta_\lambda$ that we shall find will have the form

$$\theta_\lambda = e^{-\frac{\lambda T_0}{2m}}\, (1 + o(\lambda)) \tag{2.19.12}$$

for $\lambda$ small enough (note that $\theta_\lambda < 1$ as soon as $\lambda$ is so small that $-\lambda T_0/2m + \tfrac12 \lambda^2 T_0^2/4m^2 + e^{-\lambda T_0/2m}\, o(\lambda) < 0$, as it is seen by expanding the exponential to second order). This will also prove Eq. (2.19.2) and, neglecting the infinitesimal $o(\lambda)$ in Eq. (2.19.12), it is seen that the larger the friction (compatibly with the supposed $\lambda^2 < 4mk$ and with the existence of $\bar x$), the faster the motion tends to become periodic.
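
To get a feeling for the orders of magnitude in Eqs. (2.19.2) and (2.19.12), one can tabulate the leading-order contraction factor $e^{-\lambda T_0/2m}$ and the number of periods needed for the bound $\theta_\lambda^{\,p}|\eta|$ to halve. The sketch below neglects the $o(\lambda)$ correction, uses illustrative values of $m$, $k$, $\lambda$ that are not taken from the text, and introduces the hypothetical helper names `theta` and `periods_to_halve`.

```python
import math

def theta(lam, m=1.0, k=1.0):
    """Leading-order contraction factor per period, exp(-lam*T0/(2m)), T0 = 2*pi*sqrt(m/k)."""
    T0 = 2 * math.pi * math.sqrt(m / k)
    return math.exp(-lam * T0 / (2 * m))

def periods_to_halve(lam, m=1.0, k=1.0):
    """Smallest integer p with theta**p <= 1/2."""
    return math.ceil(math.log(2) / -math.log(theta(lam, m, k)))

for lam in (0.01, 0.05, 0.2):
    # Larger friction gives a smaller theta, hence fewer periods to halve.
    print(lam, theta(lam), periods_to_halve(lam))
```

In time units the halving takes about $p\,T_0 \simeq 2m \ln 2/\lambda$, which is of the order $2m\lambda^{-1}$ appearing in Eq. (2.19.2).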

To discuss Eqs. (2.19.9) and (2.19.10), one has to find a more concrete expression for $T_1$ and, in general, for $T_i$, $i \ge 1$. Let $T_1 = T + \kappa_1$: the equation for $\kappa_1$ is

$$x_\eta(T + \kappa_1) = 0, \tag{2.19.13}$$

with the added condition that $T + \kappa_1$ should be the first positive time when the oscillator passes again through the origin with positive speed.
For $(\kappa, \eta) \in \mathbf{R}^2$, $\kappa > -T$, define

$$\psi(\kappa, \eta) = \bar x(T + \kappa) + \xi_1(T + \kappa) \tag{2.19.14}$$

and Eq. (2.19.13) becomes

$$\psi(\kappa, \eta) = 0. \tag{2.19.15}$$

Since, as seen in the preceding sections, $t \to \bar x(t)$, $t \in \mathbf{R}_+$, is a $C^\infty$ function and, obviously, so is $(\eta, t) \to \xi_1(t)$, we can say that $\psi$ is a $C^\infty$ function on its domain of definition, $\kappa > -T$.

Furthermore, by Eqs. (2.19.14) and (2.19.5), it is

$$\psi(0,0) = 0, \qquad \frac{\partial\psi}{\partial\kappa}(0,0) = v_0, \qquad \frac{\partial\psi}{\partial\eta}(0,0) = e^{-\frac{\lambda}{2m}T}\, \frac{\sin\left((k/m - \lambda^2/4m^2)^{1/2}\, T\right)}{(k/m - \lambda^2/4m^2)^{1/2}} \tag{2.19.16}$$


Then, by the implicit function theorem (see Appendix G), there is, for small $\eta$, a unique small solution of Eq. (2.19.15) which we denote $\kappa_1(\eta)$, and

$$\kappa_1(\eta) = -\frac{e^{-\frac{\lambda}{2m}T}\, \sin\left((k/m - \lambda^2/4m^2)^{1/2}\, T\right)}{v_0\, (k/m - \lambda^2/4m^2)^{1/2}}\; \eta + o_\lambda(\eta), \tag{2.19.17}$$

where the index $\lambda$ in $o_\lambda(\eta)$ recalls that the infinitesimal depends also on $\lambda$.
By taking Eq. (2.18.13) into account:

$$T = T_0\, (1 - \beta_0\lambda + o(\lambda)), \tag{2.19.18}$$


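and using $\sin\left(\sqrt{k/m}\; T_0\right) = 0$, one finds, with simple steps, from Eqs. (2.19.17) and (2.19.18), that

$$\kappa_1(\eta) = \eta\, \frac{T}{v_0}\, \bigl(o'(\lambda) + o_\lambda(\eta)\bigr) \tag{2.19.19}$$

where $o'(\lambda)$ is a suitable infinitesimal of order $\lambda^2$.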
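It then becomes possible to compute $\eta_1$:

$$\eta_1 = \dot{\bar x}(T + \kappa_1) + \dot\xi_1(T + \kappa_1) - v_0 = \ddot{\bar x}(T)\, \kappa_1 + \dot\xi_1(T + \kappa_1) + o_\lambda(\kappa_1) \tag{2.19.20}$$

where $o_\lambda(\kappa_1)$ is a $\lambda$-dependent second-order infinitesimal: this expression arises just by expanding $\dot{\bar x}$ in Taylor series near $T$ and using $\dot{\bar x}(T) = v_0$.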


The equations of motion (2.17.1) imply that $\ddot{\bar x}(T) = \ddot{\bar x}(0) = -\lambda v_0$ and Eq. (2.19.20) implies [via Eqs. (2.19.18) and (2.19.19) and some patience]:

$$\eta_1 = \eta\, e^{-\frac{\lambda T_0}{2m}}\, \bigl(1 + \beta(\lambda) + O_\lambda(\eta)\bigr), \tag{2.19.21}$$

where $\beta(\lambda)$ is an infinitesimal of higher order in $\lambda$ while $O_\lambda(\eta)$ is a $\lambda$-dependent infinitesimal of the same order as $\eta$. Therefore there is a $\delta'_\lambda < \delta_\lambda$ sufficiently small so that $|O_\lambda(\eta)| < |\beta(\lambda)|$, $\forall |\eta| < \delta'_\lambda$; then Eq. (2.19.21) implies that

$$|\eta_1| < \theta_\lambda\, |\eta|, \qquad \forall |\eta| < \delta'_\lambda \tag{2.19.22}$$

with $\theta_\lambda$ given by Eq. (2.19.12).

Hence, if $\lambda$ is small enough one finds that $|\eta| < \delta_\lambda$ implies $|\eta_1| < \delta_\lambda$ and the argument can be indefinitely repeated to estimate successively $|\eta_1|, |\eta_2|, \ldots$. Then from Eqs. (2.19.22) and (2.19.19), the Eqs. (2.19.9) and (2.19.10) follow, and Proposition 26 is thus proved. □

2.19.1 Problems


1. Investigate heuristically the stability of the solutions of Eqs. (2.17.1), (2.17.2), and (2.17.4), using the computer program of problem 1, §2.17. For each value of $\lambda$ let $v_0, x_0$ be the data, at time zero, of a periodic motion verifying Eqs. (2.17.1), (2.17.2) and (2.17.4); let the computer draw on the screen the graph of the periodic motion superimposed with the graph of the motion of a harmonic oscillator with the same mass and elastic constant (but no friction nor forcing term). Repeat this operation as $\lambda$ varies, using it to compare visually the two motions.


2. Same as Problem 1, replacing $kx$ by $k \sin x$ in Eqs. (2.17.1) and (2.17.2) (i.e., replacing the basic oscillator with a pendulum).


2.20 Frictionless Forced Oscillations: Quasi-Periodic 
Motions 


In §2.16 it has been shown that, under the action of a periodic force, a frictionless harmonic oscillator moves with a motion which is the “sum” (or “superposition”) of two periodic motions with respective periods equal to the proper oscillator period $T_0$ and to the forcing term period $T$, provided $T/T_0$ is not an integer.

The proposition in this section will help to visualize some remarkable properties of such motions. One of them appears by representing them as motions on the data space (see §2.6), i.e., on the plane $\mathbf{R}^2$ thought of as the space of the initial velocities and positions. This means that the motion $t \to x(t)$, $t \in \mathbf{R}_+$, solution of Eq. (2.16.3), i.e. of $m\ddot x + kx = f$, is represented by a curve $t \to (\dot x(t), x(t))$, $t \in \mathbf{R}_+$. This is a representation of the motion which we have not yet used: it is somewhat redundant because once $t \to x(t)$ is given, its $t$-derivative is automatically given. On the other hand, every point of the curve $t \to (\dot x(t), x(t))$, $t \in \mathbf{R}_+$, completely determines the motion. Also it may sometimes be useful to know which pairs $(\dot x, x)$ can appear during the evolution of a given motion. In such a case, this information can be directly extracted from the geometric locus described in $\mathbf{R}^2$ by $t \to (\dot x(t), x(t))$, $t \in \mathbf{R}_+$, without having to know explicitly which values of $t$ correspond to the various points of the locus.

Therefore, in the data space, a periodic motion appears as a closed curve. A motion like those met in §2.12, asymptotically periodic, appears as a curve spiraling around the closed curve representing the periodic motion and becoming indefinitely closer to it.

The structure of a superposition of two periodic motions in the data space 
representation is of particular interest: it is elucidated by the following well- 
known proposition (Euler theorem). 


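27 Proposition. Let $f, g \in C^\infty(\mathbf{R})$ be two periodic functions with minimal period $2\pi$ and let $f', g'$ be their first derivatives. Given $\omega, \omega_0 > 0$, consider the motion in $\mathbf{R}^2$ described by $t \to (\eta(t), \xi(t))$:

$$\eta(t) = \omega f'(\omega t) + \omega_0\, g'(\omega_0 t), \qquad \xi(t) = f(\omega t) + g(\omega_0 t) \tag{2.20.1}$$

Such a motion¹³ is periodic if and only if $\omega/\omega_0$ is a rational number. If $\omega/\omega_0$ is irrational, the curve $t \to (\eta(t), \xi(t))$, $t \ge t_0$, $\forall t_0 \in \mathbf{R}_+$, densely fills the region $\Omega_{f,g}$:

$$\Omega_{f,g} = \{(\eta, \xi) \,|\, (\eta, \xi) \in \mathbf{R}^2 :\ \eta = \omega f'(\alpha) + \omega_0\, g'(\beta),\ \xi = f(\alpha) + g(\beta),\ (\alpha, \beta) \in [0, 2\pi]^2\} \tag{2.20.2}$$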


Observations. 


(1) The region $\Omega_{f,g}$ can be easily visualized. Consider the curve $\Gamma_f$ in the $(\eta, \xi)$ plane, having equations

$$\eta = \omega f'(\alpha), \qquad \xi = f(\alpha), \qquad \alpha \in [0, 2\pi] \tag{2.20.3}$$

By the periodicity of $f$, this is a closed curve [see Fig. 2.9]. Given $\alpha \in [0, 2\pi]$, consider the curve $\Gamma_g(\alpha)$ with equations

$$\eta = \omega f'(\alpha) + \omega_0\, g'(\beta), \qquad \xi = f(\alpha) + g(\beta), \qquad \beta \in [0, 2\pi] \tag{2.20.4}$$

which, since $g$, too, is periodic, is a closed curve “around” $(\omega f'(\alpha), f(\alpha))$. As $\alpha$ varies in $[0, 2\pi]$, the curve $\Gamma_g(\alpha)$ “glides along $\Gamma_f$” and “sweeps” the region $\Omega_{f,g}$. A simple case is illustrated in Fig. 2.9.

(2) The relevance of this proposition for the harmonic non resonant forced oscillations is obvious after the discussion of §2.16 (see (iii) in Proposition 23). It shows that such oscillations, when $T$ and $T_0$ have an irrational ratio, are not periodic although they come back as close as desired to the initial datum, provided one waits long enough.

Fig. 2.9: The region swept densely by the quasi periodic motion with irrational ratio of the periods is the region swept by the curve $\Gamma_g(\alpha)$ as $\alpha$ varies.

13 Remark that $\eta(t) = \dot\xi(t)$.

(3) Also, for the purpose of future applications, it is interesting to give a geometric interpretation to Proposition 27 when $\omega/\omega_0$ is irrational. In this case, the analytic expression of the trajectory density in $\Omega_{f,g}$ is: given $\sigma > 0$ and $t_0 \in \mathbf{R}_+$, for all $(\alpha, \beta) \in [0, 2\pi]^2$, there is $t_\sigma(\alpha, \beta) > t_0$ such that

$$|(\alpha - \omega\, t_\sigma(\alpha, \beta)) \bmod 2\pi| < \sigma, \qquad |(\beta - \omega_0\, t_\sigma(\alpha, \beta)) \bmod 2\pi| < \sigma, \tag{2.20.5}$$

i.e., there are two integers $m_\sigma(\alpha, \beta)$ and $n_\sigma(\alpha, \beta)$ such that

$$|\alpha - \omega\, t_\sigma(\alpha, \beta) - 2\pi\, m_\sigma(\alpha, \beta)| < \sigma, \qquad |\beta - \omega_0\, t_\sigma(\alpha, \beta) - 2\pi\, n_\sigma(\alpha, \beta)| < \sigma. \tag{2.20.6}$$

Now think of the plane $\mathbf{R}^2$ as being paved with squares with side size $2\pi$ and with corners at $(2\pi r, 2\pi s)$, $r$ and $s$ being integers. In this plane, consider the straight line through the origin with slope $\omega_0/\omega$:

$$y = \omega_0 t, \qquad x = \omega t, \qquad t \in \mathbf{R} \tag{2.20.7}$$

and the half-line corresponding to $t \ge t_0$.

Next, identify the points of the plane whose coordinates differ by integer multiples of $2\pi$ (see Fig. 2.10). The just-described line can now be thought of as a set of segments in the square $[0, 2\pi]^2$, where corresponding points on opposite sides are identified (topologically, we can say that we regard the square $[0, 2\pi]^2$ as a two dimensional torus).


Figures 2.10 and 2.11: The figures represent a quasi periodic motion in the plane and its image on the torus.


Equation (2.20.6) says that at least one of the segments associated with the line of Eq. (2.20.7) with $t \ge t_0$ cuts the square neighborhood of side $2\sigma$ around $(\alpha, \beta)$ (see Fig. 2.11).

In other words, the half-line of Eq. (2.20.7) with $t \ge t_0$, brought back inside $[0, 2\pi]^2$ through the identification of the points of the plane mod $2\pi$ (i.e., thought of as a coil around the torus), densely fills $[0, 2\pi]^2$.


PROOF. If $\omega/\omega_0 = T_0/T = p/q$ (ratio of two relatively prime integers), where $T \overset{\mathrm{def}}{=} 2\pi/\omega$, $T_0 \overset{\mathrm{def}}{=} 2\pi/\omega_0$, then the motion of Eq. (2.20.1) is periodic with period $T' = pT = qT_0$. As an exercise, the reader can show that in the geometric interpretation of Fig. 2.11, this means that the line becomes a finite set of segments (forming a closed curve if $[0, 2\pi]^2$ is thought of as a torus).

Suppose now that $\omega/\omega_0$ is irrational. Define for every integer $n$ the number $\tau_n$ as

$$\alpha - \omega\tau_n + 2\pi n = 0 \;\Longleftrightarrow\; \tau_n = \frac{\alpha + 2\pi n}{\omega} \tag{2.20.8}$$

To check Eq. (2.20.5) and, therefore, the validity of the Proposition, it will suffice to show that given $n_0 \in \mathbf{Z}$ and $\sigma > 0$ arbitrarily, there exists $n \in \mathbf{Z}$, $n > n_0$, and $m(n) \in \mathbf{Z}$ such that

$$|\beta - \omega_0 \tau_n - 2\pi\, m(n)| < \sigma. \tag{2.20.9}$$

It is useful for the reader to understand (along the lines of observation 3) the geometrical meaning of Eqs. (2.20.8) and (2.20.9) (exercise).
By substituting $\tau_n$, given by Eq. (2.20.8), in Eq. (2.20.9) one finds

$$\Bigl|\beta - \frac{\omega_0}{\omega}\alpha - 2\pi \frac{\omega_0}{\omega} n - 2\pi\, m(n)\Bigr| < \sigma, \tag{2.20.10}$$

i.e. setting $\varphi_0 = \beta - \frac{\omega_0}{\omega}\alpha$:

$$\Bigl|\varphi_0 - 2\pi \frac{\omega_0}{\omega} n - 2\pi\, m(n)\Bigr| < \sigma \tag{2.20.11}$$

Eq. (2.20.11) has a geometric interpretation which is convenient to illustrate: consider the unit circle and its rotation $R$ by an angle $\theta = 2\pi(\omega_0/\omega)$ (see Fig. 2.12). The point with angular coordinate $2\pi(\omega_0/\omega)n$ can be interpreted as the image of a point $0$ on the circle under the action of the rotation $R^n$, i.e., of $n$ successive rotations $R$. If $\varphi_0$ is also interpreted as a point on the circle, Eq. (2.20.11) means that the rotation $R^n$ brings the origin to an angular distance from $\varphi_0$ less than $\sigma$.

Then our problem is to show the existence, given $\sigma > 0$, of infinitely many integers $n > 0$ such that the rotation $R^n$ brings $0$ to an angular distance of less than $\sigma$ from $\varphi_0$ (see Figure 2.12). In order to show this, it will be enough to show that there is $\bar n > 0$ such that $R^{\bar n}$ displaces the point $0$ by a non vanishing quantity $\varepsilon$ with modulus less than $\sigma$. In fact, when this happens, it is manifest that with a rotation $R^n$, $n = k\bar n$, $k = 0, 1, 2, \ldots$, one successively displaces $0$ by $\varepsilon, 2\varepsilon, 3\varepsilon, \ldots, k\varepsilon$, and therefore, sooner or later (and infinitely often), one arrives at the situation that the origin falls inside a neighborhood of $\varphi_0$ with angular amplitude $\sigma$.

To show the existence of $\bar n$, note that the sequence $(2\pi\frac{\omega_0}{\omega}k)_{k \in \mathbf{Z}_+}$, thought of as a sequence of angular coordinates on the circle, corresponds to a family of points which are pairwise distinct since

$$2\pi \frac{\omega_0}{\omega} k_1 = 2\pi \frac{\omega_0}{\omega} k_2 + 2\pi\mu \tag{2.20.12}$$

with $k_1, k_2, \mu$ integers would imply, if $k_1 \ne k_2$, that $\omega_0/\omega = \mu/(k_1 - k_2)$ = (rational number). Then, if $\bar\varphi$ is an accumulation point of the above family of points on the circle, there must exist two distinct points in such a sequence closer to $\bar\varphi$ than $\sigma/2$; i.e., there exist $k_1, k_2 > 0$ such that

$$\Bigl|\Bigl(2\pi \frac{\omega_0}{\omega} k_1 - 2\pi \frac{\omega_0}{\omega} k_2\Bigr) \bmod 2\pi\Bigr| < \sigma, \tag{2.20.13}$$

and this means that the rotation $R^{k_1 - k_2}$ displaces the point $0$ on the circle to a point whose angular distance from $0$ is $\varepsilon$ and $0 < |\varepsilon| < \sigma$ [note that $\varepsilon \ne 0$, by the remark related to Eq. (2.20.12)]. Hence, one can take $\bar n = k_1 - k_2$ if $k_1 > k_2$ or $\bar n = k_2 - k_1$ if $k_2 > k_1$. □
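
The statement just proved can be explored numerically. The sketch below is only an illustration under stated assumptions: the rotation number $\omega_0/\omega$ is taken to be the golden section, the target $\varphi_0 = 1$ and the tolerance $\sigma = 0.01$ are arbitrary, and `first_hit` is a hypothetical helper that searches by brute force for the first $n$ beyond a given $n_0$ satisfying Eq. (2.20.11).

```python
import math

def first_hit(rho, phi0, sigma, n_min=0, n_max=10**6):
    """Smallest n > n_min such that the angle 2*pi*rho*n is within sigma of phi0 (mod 2*pi)."""
    two_pi = 2 * math.pi
    for n in range(n_min + 1, n_max):
        d = (two_pi * rho * n - phi0) % two_pi
        if min(d, two_pi - d) < sigma:     # angular distance on the circle
            return n
    return None  # nothing found below n_max (this is what happens for suitable rational rho)

rho = (math.sqrt(5) - 1) / 2               # omega_0/omega: an irrational rotation number
n1 = first_hit(rho, phi0=1.0, sigma=0.01)
n2 = first_hit(rho, phi0=1.0, sigma=0.01, n_min=n1)   # a later return, n > n_0 = n1
print(n1, n2)
```

For a rational rotation number such as `rho = 0.5` the orbit visits only two points of the circle, and the same search with `phi0 = 1.0` returns `None`, in agreement with the first part of the proposition.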


2.20.1 Exercises and Problems 


Problems (1)-(9) are inspired by [26] and they aim at providing tools for 
studying the remaining problems. 


1. Let $r > 0$ be an irrational number represented by its continued fraction

$$r = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ldots}} \equiv \{a_0, a_1, a_2, \ldots\}$$

defined by setting $[x]$ = (integral part of $x$) and $a_0 = [r]$, $r_1 = (r - a_0)^{-1}$, $a_1 = [r_1]$, $r_2 = (r_1 - a_1)^{-1}$, $a_2 = [r_2]$, etc. Show that $a_j > 0$, $\forall j > 0$. Compute $a_0, a_1, a_2, \ldots$ for $r$ = golden section = $(\sqrt5 - 1)/2$ (note that $r = 1/(1 + r)$), or $r = (1 + \sqrt5)/2$ (note that $r = 1 + 1/r$), or $r = \sqrt2$ (note that $r = 1 + 1/(1 + r)$), or $r = \pi$ (recall $\pi = 3.141592653589\ldots$; using a pocket computer to find empirically (i.e., without error estimates) $a_0, a_1, a_2, \ldots, a_9$, one finds $a_0 = 3$, $a_1 = 7$, $a_2 = 15$, $a_3 = 1$, $a_4 = 292, \ldots$).


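2. In the context of problem 1 let

$$R_k = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ldots + \cfrac{1}{a_k}}} \overset{\mathrm{def}}{=} \{a_0, a_1, a_2, \ldots, a_k\}$$

Show that $R_{2k} < r < R_{2k+1}$ for all $k \ge 0$.

3. In the context of problems 1 and 2 note that if $\{a_1, \ldots, a_k\} = \frac{p'}{q'}$ then $\{a_0, a_1, \ldots, a_k\} = \frac{a_0 p' + q'}{p'}$. Deduce from this that a vector $v_k = (p_k, q_k) \in \mathbf{Z}^2$ such that $R_k = \frac{p_k}{q_k}$ can be taken

$$v_k = \begin{pmatrix} a_0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_k & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$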
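
4. Deduce from problem 3 that $v_k = a_k v_{k-1} + v_{k-2}$, i.e.

$$p_k = a_k\, p_{k-1} + p_{k-2}, \qquad k > 1$$

$$q_k = a_k\, q_{k-1} + q_{k-2}, \qquad k > 1$$

(Hint: use $\begin{pmatrix} a_k & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = a_k \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ to eliminate the last matrix in the product of matrices appearing in problem 3).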


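5. From the recursion relation in problem 4 deduce that

$$q_k\, p_{k-1} - p_k\, q_{k-1} = -(q_{k-1}\, p_{k-2} - p_{k-1}\, q_{k-2}) = (-1)^k, \qquad k \ge 2$$

$$q_k\, p_{k-2} - p_k\, q_{k-2} = a_k\, (q_{k-1}\, p_{k-2} - p_{k-1}\, q_{k-2}) = (-1)^{k-1} a_k, \qquad k \ge 2$$

so that

$$\frac{p_{k-1}}{q_{k-1}} - \frac{p_k}{q_k} = \frac{(-1)^k}{q_k\, q_{k-1}}, \qquad \frac{p_{k-2}}{q_{k-2}} - \frac{p_k}{q_k} = \frac{(-1)^{k-1} a_k}{q_k\, q_{k-2}}$$

(Hint: Multiply the first equation in the recursive formula in problem 4 by $q_{k-1}$ and the second by $p_{k-1}$ and subtract, etc.)


6. From problem (5) deduce that

$$\frac{p_0}{q_0} < \frac{p_2}{q_2} < \ldots < r < \ldots < \frac{p_3}{q_3} < \frac{p_1}{q_1} \qquad \text{and} \qquad \Bigl|r - \frac{p_k}{q_k}\Bigr| < \frac{1}{q_k\, q_{k+1}}$$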


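7. Show that $q_k \ge 2^{(k-1)/2}$, $k \ge 0$, and $p_k \ge 2^{(k-2)/2}$, $k \ge 1$. (Hint: Note that $a_k \ge 1$ for all $k \ge 1$ and use the recursive relation in problem 4 and $p_j, q_j \ge 1$.)


8. Show that

$$\frac{1}{q_k\, (q_k + q_{k+1})} < \Bigl|r - \frac{p_k}{q_k}\Bigr| < \frac{1}{q_k\, q_{k+1}}$$

(Hint: If $\frac{p}{q} < \frac{p'}{q'}$ then $\frac{p + s p'}{q + s q'}$ increases with $s$ for $s > 0$, while if $\frac{p}{q} > \frac{p'}{q'}$ it decreases. Hence if $k$ is even $\frac{p_{k-2} + s\, p_{k-1}}{q_{k-2} + s\, q_{k-1}}$ increases with $s$ and for $s = a_k$ it becomes $\frac{p_k}{q_k}$, which is such that $\frac{p_k}{q_k} < r < \frac{p_{k-1}}{q_{k-1}}$. Therefore, taking $s = 1 \le a_k$,

$$\Bigl|r - \frac{p_{k-2}}{q_{k-2}}\Bigr| > \Bigl|\frac{p_{k-2} + p_{k-1}}{q_{k-2} + q_{k-1}} - \frac{p_{k-2}}{q_{k-2}}\Bigr| = \frac{1}{q_{k-2}\, (q_{k-2} + q_{k-1})}.)$$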


9. Show that the numbers $p_n, q_n$ are relatively prime for all $n$. (Hint: this is obvious for $p_0, q_0$; suppose it is true for $p_k, q_k$, $k = 0, 1, 2, \ldots, n$, and remark that if $p', q'$ are the last convergents of the continued fraction $[a_1, a_2, \ldots, a_n]$ then they are, by the inductive assumption, relatively prime and $p_n = a_0 p' + q'$, $q_n = p'$; hence if $j$ divided $q_n$ and $p_n$, it would divide $p'$ and $q'$, against the assumption on $p', q'$.)


The following definition will be used below: a rational number $p/q$ is a best approximation for $r$ if for any pair $p', q'$ with $q' < q$ it is $|q'r - p'| > |qr - p|$.

10. Let $p, q$ be positive integers and assume $r$ irrational. Let $j$ be odd, $a = p/q$, $a_j = p_j/q_j$, and suppose that $a_{j-1} < a < a_{j+1}$; then $q > q_j$. (Hint: $a_{j-1} < a < a_{j+1} < r < a_j$ so that $(q_j q_{j-1})^{-1} > |a_{j-1} - r| > |a_{j-1} - a| = |p_{j-1}q - q_{j-1}p| / (q\, q_{j-1}) \ge 1/(q\, q_{j-1})$, because $|p_{j-1}q - q_{j-1}p| \ge 1$.) State and check the analogous result for $j$ even, showing that the two results can be summarized by saying that if $p/q$ is between two convergents of orders $j - 1$ and $j + 1$ then $q > q_j$.


11. In the context of problem (10) show that if $a$ is not a convergent and $a_{j-1} < a < a_{j+1}$ then $q_j |r - a_j| < q |r - a|$; a similar result holds for $j$ even. (Hint: $q|a - r| > q|a - a_{j+1}| = q\,|p\, q_{j+1} - q\, p_{j+1}| / (q\, q_{j+1}) \ge 1/q_{j+1} \ge q_j |a_j - r|$.)


12. Show that problems (9), (10), (11) imply that if $p/q$ is an approximation to $r$ such that $|q'r - p'| > |qr - p|$ for all $q' < q$ then $q = q_j$, $p = p_j$ for some $j$. In other words every best approximant is a convergent.


13. Show that if $r$ is irrational every convergent is a best approximant. (Hint: if not, it must be that for some $n$ there exists $q < q_n$ with $|rq - p| < |rq_n - p_n| \equiv \varepsilon_n$; let $p, q$ minimize the expression $|q'r - p'|$ for $q' < q_n$; if $\bar\varepsilon$ is the minimum value, it is $\bar\varepsilon < \varepsilon_n$; hence $p/q$ is a best approximation, so that $p = p_s$, $q = q_s$ for some $s < n$ and $1/(q_s + q_{s+1}) < |q_s r - p_s| < |q_n r - p_n| < 1/q_{n+1}$, i.e. $q_s + q_{s+1} > q_{n+1}$, which contradicts $q_{n+1} = a_{n+1} q_n + q_{n-1}$.)


14. A necessary and sufficient condition in order that a rational approximation to an irrational number $r$ be a best approximation is that it is a convergent of the continued fraction of $r$. (Hint: just a summary of Problems (9)-(13)).


15. Show that if $q_{n-1} < q < q_n$ then $|qr - p| > |q_{n-1} r - p_{n-1}|$. (Hint: if not, and if $\bar\varepsilon = \min |qr - p|$ over $q_{n-1} < q < q_n$ and over $p$ is reached at some $\bar q, \bar p$, then $\bar p/\bar q$ would be a best approximation). Show that this can be interpreted as saying that the graph of the function $\eta(q) = \min_p |qr - p|$ is above that of the function $\eta_0(q) = \varepsilon_n = |q_n r - p_n|$ for $q_n \le q < q_{n+1}$.


16. Suppose $n$ even and think of the interval $[0, 1]$ as a circle of radius $1/2\pi$: the point $q_n r \bmod 1$ can be represented as a point displaced by $\varepsilon_n$ to the right of $0$, while $q_{n-1} r$ can be viewed as a point to the left of $0$ by $\varepsilon_{n-1}$. Show that the points $qr$ with $q_n < q < q_{n+1}$ are not in the interval $[0, \varepsilon_{n-1}]$ unless $q/q_n$ is an integer $\le a_{n+1}$. Furthermore show that the point $a_{n+1} q_n r$ is closer than $\varepsilon_n$ to $\varepsilon_{n-1}$, and that the next position closest to $0$ occurs when the rotation by $(q_n a_{n+1} + q_{n-1}) r = q_{n+1} r$ is considered and it is to the left of $0$ and at a distance $\varepsilon_{n+1}$ from it. Show that this provides a natural interpretation of the meaning of the numbers $a_j$ in the continued fraction of $r$ regarded as a rotation of the circle $[0, 1]$, as well as a geometric interpretation of the relation $a_{n+1} q_n + q_{n-1} = q_{n+1}$.


17. Show that the function $\varepsilon(T)$ = maximum gap between points of the form $nr \bmod 1$, $n = 0, 1, \ldots, T$, depends on $T$ as

$$\begin{array}{ll}
q_n \le T < q_n + q_{n-1} & \varepsilon(T) = \varepsilon_{n-1} \\
q_n + q_{n-1} \le T < 2 q_n + q_{n-1} & \varepsilon(T) = \varepsilon_{n-1} - \varepsilon_n \\
\ldots & \ldots \\
(a_{n+1} - 1)\, q_n \le T < a_{n+1} q_n + q_{n-1} = q_{n+1} & \varepsilon(T) = \varepsilon_{n-1} - (a_{n+1} - 1)\, \varepsilon_n
\end{array}$$

and apply this to draw the diagram of $\varepsilon(T)$ and its inverse $T(\varepsilon)$ for the golden number, i.e. the number with $a_j \equiv 1$. Plot $-\log \varepsilon(T)$ in terms of $\log T$. (Hint: this is simply another interpretation of problem (16)).


18. Show that if the entries $a_j$ of the irrational number $r$ are uniformly bounded by $N$ then the growth of $q_n$ is bounded by an exponential (and one can estimate $q_n$ by a constant times $[(N + (N^2 + 4)^{1/2})/2]^n$). Vice-versa an exponential bound can hold if and only if the entries of the continued fraction are uniformly bounded. (Hint: it is bounded by the $q_n$ of the number with continued fraction with entries all equal to $N$).


19. Show that if the inequality $|q_n r - p_n| > 1/(C q_n)$ holds for all $n$ and for a suitable $C$ then $q_n$ cannot grow faster than exponential. (Hint: Problem (8) implies the inequality $1/(C q_n) < 1/q_{n+1}$.)


20. Show that if a number has a continued fraction with entries which eventually are periodically repeated, then it is a number verifying a quadratic equation, i.e. it is a quadratic irrational. Vice-versa it can also be shown that all quadratic irrationals have a continued fraction with entries eventually periodically repeated. (See problems (21), (22) below).


21. Suppose that for some integers $a, b, c$ it is $ar^2 + br + c = 0$. Remark that the argument in Problem (3) shows that the number $r_n = [a_n, a_{n+1}, \ldots]$ verifies $r = (p_{n-1} r_n + p_{n-2}) / (q_{n-1} r_n + q_{n-2})$. Substituting the latter expression in the equation for $r$ one finds that $r_n$ verifies an equation like $A_n r_n^2 + B_n r_n + C_n = 0$. Check, by direct calculation of $A_n, B_n, C_n$, that:

$$A_n = a\, p_{n-1}^2 + b\, p_{n-1} q_{n-1} + c\, q_{n-1}^2$$

$$C_n = A_{n-1}$$

$$B_n^2 - 4 A_n C_n = b^2 - 4ac$$

Show that $|A_n|, |B_n|, |C_n|$ are uniformly bounded by $H = 2(2|a| r + |b| + |a|) + |b|$. (Hint: it suffices to find a bound for $|A_n|$. Write $A_n = q_{n-1}^2 \bigl(a (p_{n-1}/q_{n-1})^2 + b (p_{n-1}/q_{n-1}) + c\bigr)$ and use that $|r - p_{n-1}/q_{n-1}| < 1/q_{n-1}^2$ and $ar^2 + br + c = 0$).


22. Show that a quadratic irrational has an eventually periodic continued fraction because, as a consequence of the results of the previous problem, the numbers $r_n$ can only take finitely many values. Show that, if $H$ is the constant introduced in problem (21), the period length can be bounded by $2(2H + 1)^2$ and that the periodic part has to start from the $j$-th entry with $j < 2(2H + 1)$.

23. Let $\omega = \{a_0, a_1, \ldots, a_k\}$, $a_i \ge 1$, $i > 0$, be a rational number and let $\underline\omega = (\omega, 1)$. Consider the periodic motion on $\mathbf{T}^2$ given by $t \to \varphi_0 + t\, \underline\omega$. Estimate (from below) the maximum distance that a point can have from the trajectory of $\varphi_0$.


24. Determine the region $\Omega$ densely covered by the data-space trajectory of the motion $\ddot x + x = 3 \cos \omega t$, $\dot x(0) = x(0) = 0$, when $\omega$ is irrational.


25. For $\omega$ = golden section (see Problem (1)) estimate the minimum time $T$ necessary for the trajectory of the oscillator in Problem (24) to cover $\Omega$ so that any point in $\Omega$ has a distance from the trajectory $t \to (\dot x(t), x(t))$, $t \in [0, T]$, not exceeding $\sigma = 2\pi/24$.


26. Same as Problem (25) for $\omega = \sqrt2$ and for $\omega = \pi$ (use for $\pi$ an empirically computed continued fraction; see Problem (1)).


27. Let $\bar\omega = \{a_0, a_1, \ldots, a_k\}$, $a_i \ge 1$, $i > 0$, be a rational number. In terms of $k$, estimate the maximum distance of a point in $\Omega$ from the trajectory of the oscillator in Problem (24) with $\bar\omega$ replacing $\omega$.
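
Several of the above problems lend themselves to quick computer experiments. The Python sketch below is offered only as an aid, not as a solution. `cf_entries` follows the floating-point recipe of problem 1 (so only the first few entries are reliable), `convergents` uses the recursion of problem 4, `is_best_approximation` tests by brute force the definition preceding problem 10, and `sqrt_cf` uses a standard exact integer recurrence for square roots, which is a different route from the one suggested in problems 21 and 22. All four function names are hypothetical.

```python
import math

def cf_entries(r, count):
    """First `count` entries via a_j = [r_j], r_{j+1} = 1/(r_j - a_j) (floating point)."""
    entries = []
    for _ in range(count):
        a = math.floor(r)
        entries.append(a)
        frac = r - a
        if frac == 0:              # r is numerically an integer: the expansion stops
            break
        r = 1.0 / frac
    return entries

def convergents(entries):
    """Pairs (p_k, q_k) from p_k = a_k p_{k-1} + p_{k-2}, q_k = a_k q_{k-1} + q_{k-2}."""
    p_prev, q_prev = 1, 0          # formal starting values (p_{-1}, q_{-1})
    p, q = entries[0], 1           # (p_0, q_0)
    out = [(p, q)]
    for a in entries[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

def is_best_approximation(p, q, r):
    """True if |q' r - p'| > |q r - p| for every 1 <= q' < q and every integer p'."""
    err = abs(q * r - p)
    for q2 in range(1, q):
        p2 = round(q2 * r)         # the nearest integer minimises |q2*r - p'|
        if abs(q2 * r - p2) <= err:
            return False
    return True

def sqrt_cf(N, count):
    """First `count` entries of sqrt(N), N a positive non-square integer (exact arithmetic).

    Each complete quotient is stored as (m + sqrt(N)) / d with integers m, d.
    """
    a0 = math.isqrt(N)
    m, d, a = 0, 1, a0
    entries = [a]
    while len(entries) < count:
        m = d * a - m
        d = (N - m * m) // d       # this division is exact for the recurrence
        a = (a0 + m) // d
        entries.append(a)
    return entries

pi_entries = cf_entries(math.pi, 5)
pi_conv = convergents(pi_entries)
print(pi_entries, pi_conv)
print(sqrt_cf(7, 9))
```

For $\pi$ the first five entries come out as 3, 7, 15, 1, 292, and the corresponding convergents include the familiar 22/7 and 355/113; the entries of $\sqrt7$ display the periodicity discussed in problems 20 to 22.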


2.21 Quasi-Periodic Functions. Multi Periodic Functions. 
Tori and the Multidimensional Fourier Theorem 


The considerations of §2.20 suggest the following definition.

11 Definition. A function $f \in C^\infty(\mathbf{R})$ is “quasi-periodic with pulsations $\omega_1, \ldots, \omega_d$” if there exists a $\varphi \in C^\infty(\mathbf{R}^d)$ such that

$$\varphi(\alpha_1, \ldots, \alpha_i, \ldots, \alpha_d) = \varphi(\alpha_1, \ldots, \alpha_i + 2\pi, \ldots, \alpha_d), \tag{2.21.1}$$

$\forall \alpha \in \mathbf{R}^d$, $i = 1, 2, \ldots, d$, and

$$f(t) = \varphi(\omega_1 t, \ldots, \omega_d t), \qquad t \in \mathbf{R}. \tag{2.21.2}$$

The numbers $T_1 = 2\pi/\omega_1, \ldots, T_d = 2\pi/\omega_d$ are called the “periods” of $f$, while $\nu_1 = T_1^{-1}, \ldots, \nu_d = T_d^{-1}$ are the “frequencies” of $f$.
Observations. 


(1) Therefore the motion of a harmonic oscillator with pulsation $\omega_0$ forced by a periodic force with pulsation $\omega$ is, in absence of resonances, a quasi-periodic function with pulsations $\omega_0$ and $\omega$ [see Eq. (2.16.6) and §2.20].

(2) The above definition of a quasi-periodic function is more restrictive than the one sometimes found in mathematical literature: it is, however, sufficiently general for our purposes.

(3) It is useful to note that given $f$, there exist several choices of $d, \omega_1, \ldots, \omega_d$ and $\varphi$ allowing us to represent $f$ as in Eq. (2.21.2). A trivial example is provided by the consideration of a function $\varphi \in C^\infty(\mathbf{R})$, periodic with period $2\pi$, and of the functions of $\xi \in \mathbf{R}$ or of $(\xi_1, \xi_2) \in \mathbf{R}^2$ defined as $\psi(\xi) = \varphi(2\xi)$ or $\psi(\xi_1, \xi_2) = \varphi(2\xi_1 + 3\xi_2)$ which, via the formulae

$$f(t) = \psi\bigl(\tfrac{\omega}{2} t\bigr) = \varphi(\omega t), \tag{2.21.3}$$

$$f(t) = \psi\bigl(\tfrac{\omega}{4} t, \tfrac{\omega}{6} t\bigr) = \varphi(\omega t), \tag{2.21.4}$$

allow a representation of $f$ as a quasi-periodic function with angular velocities $\omega/2$ or $\omega$ or $(\omega/4, \omega/6)$.

(4) The pulsations (or “angular velocities”) in Definition 11 need not neces- 
sarily all be positive: some may be zero or negative. 
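
The non-uniqueness pointed out in observation (3) is easy to check on a concrete case. The sketch below assumes an arbitrary smooth $2\pi$-periodic function (the particular `phi` is an illustrative choice, not taken from the text) and takes $\psi(\xi) = \varphi(2\xi)$ and $\psi(\xi_1, \xi_2) = \varphi(2\xi_1 + 3\xi_2)$, here called `psi1` and `psi2`; it then compares the three representations of $f$ at a few sample times.

```python
import math

def phi(alpha):
    """An arbitrary smooth 2*pi-periodic function, chosen only for illustration."""
    return math.cos(alpha) + 0.5 * math.sin(3 * alpha)

def psi1(xi):
    """One-variable representative: psi(xi) = phi(2*xi)."""
    return phi(2 * xi)

def psi2(xi1, xi2):
    """Two-variable representative: psi(xi1, xi2) = phi(2*xi1 + 3*xi2)."""
    return phi(2 * xi1 + 3 * xi2)

omega = 1.7                                   # illustrative pulsation
samples = [0.0, 0.3, 2.5, 10.0]
direct = [phi(omega * t) for t in samples]                      # pulsation omega
via_one = [psi1(omega * t / 2) for t in samples]                # pulsation omega/2
via_two = [psi2(omega * t / 4, omega * t / 6) for t in samples] # pulsations (omega/4, omega/6)
print(direct, via_one, via_two)
```

The three lists agree up to rounding, although the corresponding sets of pulsations are different.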


The functions $\varphi$ used to introduce the notion of quasi-periodic function are remarkable in themselves, and it is convenient to set up the following definition.


12 Definition. Given $L_1, \ldots, L_d > 0$, consider the pavement of $\mathbf{R}^d$ whose tesserae are the parallelepiped $[0, L_1] \times [0, L_2] \times \ldots \times [0, L_d]$ and the parallelepipeds obtained by translating it by $(n_1 L_1, \ldots, n_d L_d)$, $n_1, \ldots, n_d$ integers. Two points $\xi, \eta \in \mathbf{R}^d$ will be declared equivalent if they are “equally located” in the pavement's tesserae, i.e., if there are $d$ integers $n_1, \ldots, n_d$ such that $\xi_i - \eta_i = n_i L_i$, $i = 1, \ldots, d$. Then $\mathbf{T}^d(L_1, \ldots, L_d)$ will denote the set of the equivalence classes thus obtained and a “distance” will be defined as

$$d(\{\xi\}, \{\eta\}) = \min_{\xi' \in \{\xi\},\ \eta' \in \{\eta\}} |\xi' - \eta'| \tag{2.21.5}$$

if $\{\xi\}, \{\eta\} \in \mathbf{T}^d(L_1, \ldots, L_d)$ and $\{\xi\}$ denotes the equivalence class containing $\xi$. The set $\mathbf{T}^d(L_1, \ldots, L_d)$, regarded as a metric space with the distance function defined by Eq. (2.21.5) (“distance on the torus”), will be called a “$d$-dimensional torus” with sides $L_1, \ldots, L_d$. If $L_1 = L_2 = \ldots = L_d = 2\pi$, this torus will be denoted $\mathbf{T}^d$, simply, and called “standard torus”.
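
The minimum in Eq. (2.21.5) can be computed coordinate by coordinate when $|\cdot|$ is the Euclidean norm, which is the assumption made in the following sketch (the function name `torus_distance` is hypothetical). Each coordinate difference is reduced to its representative of smallest modulus, and the norm of the reduced vector is returned.

```python
import math

def torus_distance(xi, eta, sides):
    """Distance (2.21.5) between the classes of xi and eta, assuming the Euclidean norm."""
    total = 0.0
    for x, y, L in zip(xi, eta, sides):
        delta = (x - y) % L            # representative of the difference in [0, L)
        delta = min(delta, L - delta)  # shortest displacement along this coordinate
        total += delta * delta
    return math.sqrt(total)

two_pi = 2 * math.pi
# Two points near opposite ends of [0, 2*pi] are close on the standard torus T^1.
print(torus_distance((0.1,), (two_pi - 0.1,), (two_pi,)))
# Equivalent points (differing by multiples of the sides) are at zero distance.
print(torus_distance((0.1, 0.0), (0.1 + two_pi, 2 * two_pi), (two_pi, two_pi)))
```

The coordinate-wise reduction is legitimate because the translations by $n_i L_i$ act independently on each coordinate, so the squared norm is minimized term by term.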


Observations. 


(1) The above definition, in spite of its apparent complexity, is simple and can 
be informally summarized by saying that the torus T4¢(L1,..., La) is obtained 
by “identifying the opposite sides” of the parallelepiped [0, L1] x ... x [0, Lal 
of RË. For this reason it is customary to describe points of T4(L1,..., La) 
through the Cartesian coordinates in R% of one of the corresponding repre- 
sentatives without explicit mention of the equivalence relation: when Lı = 
... = La = 27, such coordinates are called the “natural angular coordinates” 
or “flat coordinates” on 74. In general, the distance [Eq. (2.21.5)] is called 
the distance between £ and 7 on the torus T@(Ly,..., La). 

(2) Clearly $T^d$ can be regarded as the product of $d$ unit circles. If
$(\varphi_1,\dots,\varphi_d)$ are the natural angular coordinates of $\varphi\in T^d$, a natural
one-to-one continuous mapping of $T^d$ into $S\times S\times\dots\times S$, where $S$ = (unit
circle in the complex plane), is

$$\varphi = (\varphi_1,\dots,\varphi_d) \to z = (z_1,\dots,z_d) = (e^{i\varphi_1},\dots,e^{i\varphi_d}) \qquad (2.21.6)$$

and the distance (2.21.5) turns out to be equivalent to the distance on
$S\times S\times\dots\times S$ as a subset of $C^d$. Therefore, the d-dimensional torus $T^d$ can
be regarded as a subset of the d-dimensional complex space $C^d$. This
representation is more intrinsic since it does not involve coordinates defined
mod $2\pi$. It will turn out to be a deep and very useful representation (see
Chapter V, §5.10-5.12).
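As an illustration of Definition 12 and of Eq. (2.21.6), here is a minimal Python sketch. It assumes that $|\cdot|$ in Eq. (2.21.5) is the Euclidean norm, so that the minimum over representatives can be taken coordinate by coordinate; the function names `torus_distance` and `to_circles` are hypothetical and chosen only for this example.

```python
import cmath
import math


def torus_distance(xi, eta, sides):
    """Distance (2.21.5) on T^d(L_1,...,L_d) between the classes of xi and eta."""
    total = 0.0
    for x, y, L in zip(xi, eta, sides):
        delta = (x - y) % L            # representative of the difference in [0, L)
        delta = min(delta, L - delta)  # shortest way around a circle of length L
        total += delta * delta
    return math.sqrt(total)


def to_circles(phi):
    """Map (2.21.6): angular coordinates -> point of S x ... x S in C^d."""
    return tuple(cmath.exp(1j * angle) for angle in phi)
```

Two points that differ by a multiple of the sides have distance zero and the same image under `to_circles`, which is the sense in which the complex representation avoids coordinates defined mod $2\pi$.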


13 Definition. $C^\infty(T^d(L_1,\dots,L_d))$ is, by definition, the set of the functions
$f$ defined on $T^d(L_1,\dots,L_d)$ such that setting (notations of Definition 12)

$$F(\xi) = f(\{\xi\}), \quad \forall\,\xi\in R^d \qquad (2.21.7)$$

the function $F$ is in $C^\infty(R^d)$. The set of functions on $R^d$ having the form of
Eq. (2.21.7) is the set of the “multi periodic functions on $R^d$” with periods
$L_1,\dots,L_d$.

When $F$ has the form of Eq. (2.21.7) with $f\in C^\infty(T^d(L_1,\dots,L_d))$, the same
happens for the partial derivatives of $F$ since the derivatives of a
$C^\infty$-multi periodic function are still multi periodic; i.e., given $d$
nonnegative integers $n_1,\dots,n_d$, there is $\varphi_{n_1,\dots,n_d}\in C^\infty(T^d(L_1,\dots,L_d))$
such that

$$\frac{\partial^{n_1+\dots+n_d}F}{\partial\xi_1^{n_1}\cdots\partial\xi_d^{n_d}}(\xi) = \varphi_{n_1,\dots,n_d}(\{\xi\}) \qquad (2.21.8)$$

and it is natural to set

$$\frac{\partial^{n_1+\dots+n_d}f}{\partial\xi_1^{n_1}\cdots\partial\xi_d^{n_d}}(\{\xi\}) = \varphi_{n_1,\dots,n_d}(\{\xi\}) \qquad (2.21.9)$$

Depending on the circumstances, it is possible to think or not to think of a
$C^\infty$-multi periodic function with periods $L_1,\dots,L_d$ and its partial
derivatives as an element of $C^\infty(T^d(L_1,\dots,L_d))$.

Observations. 


(1) Another natural definition of $C^\infty(T^d)$, for $L_1=\dots=L_d=2\pi$, could be
related to observation (2) to Definition 12: one could say that $f\in C^\infty(T^d)$
if $f(\varphi) = F(z)$, where $F$ is a $C^\infty$-function on $C^d$ $^{14}$ and $z$ is given by Eq.
(2.21.6). This would in fact be an equivalent definition, as could be shown;
see Problems (6)-(10) at the end of this section.

(2) Along the same lines, after Definition 13, one can define the classes
$C^\infty(V\times T^d)$, where $V$ is an open set in $R^d$, and the derivatives of their
elements. One can also define $W\times T^\ell$-valued functions in $C^\infty(V\times T^d)$, and
their derivatives, as the $C^\infty(V\times T^d)$ functions with values in $R^s\times R^\ell$ whose
last $\ell$ components are thought of as angular coordinates on $T^\ell$ (for
$W\subset R^s$, $V\subset R^d$ open sets).


(3) An example of a multi periodic function on $R^d$ with periods
$2\pi/\omega_1,\dots,2\pi/\omega_d$ is the sum of the series

$$f(\xi_1,\dots,\xi_d) = \sum_{n_1,\dots,n_d\in Z} c_{n_1,\dots,n_d}\, e^{i(\omega_1 n_1\xi_1+\dots+\omega_d n_d\xi_d)} \qquad (2.21.10)$$

provided the coefficients $c_{n_1,\dots,n_d}\in C$ verify (“reality condition”)

$$c_{n_1,\dots,n_d} = \overline{c}_{-n_1,\dots,-n_d} \qquad (2.21.11)$$

and if, $\forall\,s = 0,1,\dots$, there exists $\gamma_s > 0$ such that (“regularity condition”)

$$(1+|n_1|)^s\cdots(1+|n_d|)^s\,|c_{n_1,\dots,n_d}| < \gamma_s \qquad (2.21.12)$$

The partial derivatives of $f$, in the sense of Definition 12, can be computed
by series differentiation, as a result of Eq. (2.21.12).
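The role of the reality condition and of the periods in Eq. (2.21.10) can be checked on a finite sum. The following Python sketch is only an illustration: the helper name `multiperiodic`, the pulsations and the four coefficients are hypothetical choices, and a finite set of coefficients trivially satisfies Eq. (2.21.12).

```python
import cmath
import math


def multiperiodic(coeffs, omega):
    """Return f(xi) = sum_n c_n exp(i (omega_1 n_1 xi_1 + ... + omega_d n_d xi_d))."""
    # Reality condition (2.21.11): c_n must be the complex conjugate of c_{-n}.
    for n, c in coeffs.items():
        mirror = tuple(-k for k in n)
        if abs(coeffs.get(mirror, 0) - c.conjugate()) > 1e-12:
            raise ValueError(f"reality condition fails at n={n}")

    def f(xi):
        total = 0j
        for n, c in coeffs.items():
            phase = sum(w * k * x for w, k, x in zip(omega, n, xi))
            total += c * cmath.exp(1j * phase)
        return total.real  # the imaginary part cancels thanks to (2.21.11)

    return f


# Hypothetical example data: d = 2, pulsations (1, 3), four nonzero coefficients.
omega = (1.0, 3.0)
coeffs = {(1, 0): 0.5, (-1, 0): 0.5, (1, 2): -0.25j, (-1, -2): 0.25j}
f = multiperiodic(coeffs, omega)
```

With these data the sum is $\cos\xi_1 + \frac12\sin(\xi_1+6\xi_2)$, which is unchanged when $\xi_1$ is shifted by $2\pi/\omega_1 = 2\pi$ or $\xi_2$ by $2\pi/\omega_2 = 2\pi/3$.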


It is important to realize that, vice versa, Eqs. (2.21.10), (2.21.11), and 
(2.21.12) provide the most general example. This is essentially the content of 
the following proposition (“multidimensional Fourier theorem” ). 


14 A $C^\infty$-function on $C^d$ is a function on $C^d$ which is $C^\infty$ in the real and
imaginary parts of the coordinates.

28 Proposition. Let $f$ be a $C^\infty$-multi periodic function on $R^d$ with periods
$L_1,\dots,L_d > 0$; then it is possible to represent $f$ by formula (2.21.10) with
coefficients $c_{n_1,\dots,n_d}$ verifying Eqs. (2.21.11) and (2.21.12) and given by

$$c_{n_1,\dots,n_d} = \int_0^{L_1}\frac{d\xi_1}{L_1}\cdots\int_0^{L_d}\frac{d\xi_d}{L_d}\; f(\xi_1,\dots,\xi_d)\, e^{-i(\omega_1 n_1\xi_1+\dots+\omega_d n_d\xi_d)} \qquad (2.21.13)$$

where $\omega_j = 2\pi/L_j$, $j = 1,\dots,d$.

Observations. 

(1) If d = 1, this proposition coincides with the Fourier development theorem 
for periodic functions (see Proposition 19). 

(2) Since $\tilde f(\xi_1,\dots,\xi_d) = f(\xi_1/\omega_1,\dots,\xi_d/\omega_d)$ is multi periodic with
periods $2\pi$, it will suffice to prove the above proposition when
$\omega_1 = \dots = \omega_d = 1$.


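PROOF (Case $\omega_1 = \dots = \omega_d = 1$). The proof can be developed by induction.
For $d = 1$ it holds (see Proposition 19, §2.12); hence assume its validity for
$d = 1,2,\dots,k$ and consider the case $d = k+1$.

Let $f\in C^\infty(T^{k+1})$ and contemplate the function $\varphi_{\xi_{k+1}}$, parameterized by
$\xi_{k+1}\in R$ and defined on $R^k$:

$$\varphi_{\xi_{k+1}} : (\xi_1,\dots,\xi_k) \to f(\xi_1,\dots,\xi_k,\xi_{k+1}) \qquad (2.21.14)$$

which, $\forall\,\xi_{k+1}\in R$, is a $C^\infty$-$2\pi$-multi periodic function on $R^k$. The inductive
hypothesis implies

$$f(\xi_1,\dots,\xi_k,\xi_{k+1}) = \sum_{n_1,\dots,n_k}\varphi_{n_1,\dots,n_k}(\xi_{k+1})\, e^{i(n_1\xi_1+\dots+n_k\xi_k)} \qquad (2.21.15)$$

with

$$\varphi_{n_1,\dots,n_k}(\xi_{k+1}) = \int_0^{2\pi}\frac{d\xi_1\cdots d\xi_k}{(2\pi)^k}\; f(\xi_1,\dots,\xi_k,\xi_{k+1})\, e^{-i(n_1\xi_1+\dots+n_k\xi_k)} \qquad (2.21.16)$$

On the other hand, Eq. (2.21.16) immediately implies that $\varphi_{n_1,\dots,n_k}(\xi_{k+1})$ is
a $C^\infty$-function, periodic with period $2\pi$, of $\xi_{k+1}$, for all choices of
$(n_1,\dots,n_k)\in Z^k$. Via the Fourier theorem for $d = 1$ it follows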


$$\varphi_{n_1,\dots,n_k}(\xi_{k+1}) = \sum_{n_{k+1}\in Z} e^{i n_{k+1}\xi_{k+1}}\int_0^{2\pi}\varphi_{n_1,\dots,n_k}(\xi'_{k+1})\, e^{-i n_{k+1}\xi'_{k+1}}\,\frac{d\xi'_{k+1}}{2\pi} \qquad (2.21.17)$$

i.e., using Eq. (2.21.13) as definition of $c_{n_1,\dots,n_{k+1}}$ and inserting Eq.
(2.21.16) in the right-hand side of Eq. (2.21.17), we find

$$\varphi_{n_1,\dots,n_k}(\xi_{k+1}) = \sum_{n_{k+1}\in Z} c_{n_1,\dots,n_{k+1}}\, e^{i n_{k+1}\xi_{k+1}} \qquad (2.21.18)$$

Substituting Eq. (2.21.18) into Eq. (2.21.15), one obtains Eq. (2.21.10),
provided Eq. (2.21.12) holds (which implies that the series on $n_{k+1}$, and on
$n_1,\dots,n_k$ can be unconditionally summed and interchanged since they are
absolutely convergent).

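It is therefore necessary to check Eq. (2.21.12) in order to complete the
proof. In fact, Eq. (2.21.11) follows from Eq. (2.21.13) which has now become,
temporarily, a definition of $c_{n_1,\dots,n_{k+1}}$. One can proceed as in the analogous
situation met in the one-dimensional case: one integrates Eq. (2.21.13) by
parts. By integrating $\sigma$ times with respect to $\xi_j$ by parts, we find, if $n_j\ne 0$:

$$c_{n_1,\dots,n_{k+1}} = \frac{1}{(i n_j)^\sigma}\int_0^{2\pi}\frac{d\xi_1\cdots d\xi_{k+1}}{(2\pi)^{k+1}}\;\frac{\partial^\sigma f}{\partial\xi_j^\sigma}(\xi_1,\dots,\xi_k,\xi_{k+1})\, e^{-i\sum_{r=1}^{k+1} n_r\xi_r} \qquad (2.21.19)$$

and, if $F^{(\sigma)}_j = \max_\xi \left|\frac{\partial^\sigma f}{\partial\xi_j^\sigma}(\xi)\right|$, this yields

$$|c_{n_1,\dots,n_{k+1}}| \le \frac{F^{(\sigma)}_j}{|n_j|^\sigma} \qquad (2.21.20)$$

From Eq. (2.21.13), and $\forall\,n_1,\dots,n_{k+1}$ (zero or not), $|c_{n_1,\dots,n_{k+1}}|$ is
bounded by the maximum $F^{(0)}$ of $|f|$; this covers the case $n_j = 0$. Together
with Eq. (2.21.20), it implies the existence of some $F_\sigma > 0$ such that,
$\forall\,j = 1,\dots,k+1$, $\forall\,\sigma\in Z_+$, $\forall\,(n_1,\dots,n_{k+1})\in Z^{k+1}$:

$$|c_{n_1,\dots,n_{k+1}}| \le \frac{F_\sigma}{(1+|n_j|)^\sigma} \qquad (2.21.21)$$

(take, for instance, $F_\sigma = 2^\sigma\,(F^{(0)} + \max_j F^{(\sigma)}_j)$, using
$(1+|n_j|)^\sigma \le 2^\sigma |n_j|^\sigma$ for $n_j\ne 0$). Hence, multiplying Eq. (2.21.21) on $j$
as $j$ varies between 1 and $k+1$ and then taking the $(k+1)$-th root, side by
side, of the result, it is

$$|c_{n_1,\dots,n_{k+1}}| \le \frac{F_\sigma}{\big((1+|n_1|)\cdots(1+|n_{k+1}|)\big)^{\sigma/(k+1)}} \qquad (2.21.22)$$

implying Eq. (2.21.12) by the arbitrariness of $\sigma > 0$. mbe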
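Formula (2.21.13) can be tried numerically in the case $d = 2$, $\omega_1 = \omega_2 = 1$. The sketch below replaces the integral by a uniform Riemann sum, which is exact (up to rounding) for a trigonometric polynomial whose frequencies are small compared with the grid size; the names `fourier_coefficient` and `sample`, and the test function itself, are assumptions made for this example.

```python
import cmath
import math


def fourier_coefficient(f, n, grid=16):
    """Approximate (2.21.13) for d = 2 and periods 2*pi by a uniform Riemann sum."""
    n1, n2 = n
    h = 2 * math.pi / grid
    total = 0j
    for a in range(grid):
        for b in range(grid):
            x1, x2 = a * h, b * h
            total += f(x1, x2) * cmath.exp(-1j * (n1 * x1 + n2 * x2))
    return total / grid**2


def sample(x1, x2):
    # Hypothetical test function; its exact coefficients are known by inspection:
    # c_(0,0) = 1, c_(1,0) = c_(-1,0) = 1/2, c_(1,2) = -i/4, c_(-1,-2) = i/4.
    return 1.0 + math.cos(x1) + 0.5 * math.sin(x1 + 2 * x2)
```

The computed values satisfy the reality condition (2.21.11), and every coefficient outside the five listed in the comment vanishes, so the regularity condition (2.21.12) holds trivially here.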


It is useful to explicitly state the following obvious corollary of Proposition 
28 and Definition 11. 


29 Corollary. If $f$ is a $C^\infty$-quasi-periodic function with pulsations
$\omega_1,\dots,\omega_d > 0$, then it admits a representation of the type

$$f(t) = \sum_{n\in Z^d} c_n\, e^{i\,\omega\cdot n\, t} \qquad (2.21.23)$$

where $\omega = (\omega_1,\dots,\omega_d)$, $n = (n_1,\dots,n_d)$, and the constants
$c_n = c_{n_1,\dots,n_d}$ verify Eqs. (2.21.11) and (2.21.12).

It is remarkable that in some cases, given $\omega_1,\dots,\omega_d$, the representation
Eq. (2.21.23) is unique.


30 Proposition. Let $f\in C^\infty(R)$ be quasi periodic with pulsations
$\omega_1,\dots,\omega_d > 0$. If the pulsations are rationally independent,$^{15}$ the
coefficients of the representation (2.21.23) are given by

$$c_n = \lim_{t\to+\infty}\frac{1}{t}\int_0^t e^{-i\,\omega\cdot n\,\tau} f(\tau)\, d\tau \qquad (2.21.24)$$

and, therefore, the representation (2.21.23) is unique, given $\omega = (\omega_1,\dots,\omega_d)$.


PROOF. Taking into account the decay properties of $c_n$ as $n\to\infty$ expressed
by Eq. (2.21.12), the integral in Eq. (2.21.24) can be computed via the series
in Eq. (2.21.23):

$$\frac{1}{t}\int_0^t e^{-i\,\omega\cdot m\,\tau} f(\tau)\, d\tau = \sum_{n\in Z^d} c_n\,\frac{1}{t}\int_0^t e^{i\,\omega\cdot(n-m)\,\tau}\, d\tau \qquad (2.21.25)$$

The right-hand integral divided by $t$ has a modulus $\le 1$ (as an average of a
function with modulus 1). Therefore, Eq. (2.21.12) shows that the series in
Eq. (2.21.25) is a series uniformly convergent with respect to $t$ and that the
limit of Eq. (2.21.24) can be computed in Eq. (2.21.25) by interchanging it
with the series. If $n\ne m$, the integral in Eq. (2.21.25), divided by $t$, is

$$\frac{1}{t}\,\frac{e^{i(n-m)\cdot\omega\, t}-1}{i\,(n-m)\cdot\omega}\ \xrightarrow[t\to+\infty]{}\ 0 \qquad (2.21.26)$$

because $(n-m)\cdot\omega\ne 0$ by the rational independence assumption on $\omega$.
Hence all the terms in Eq. (2.21.25) with $n\ne m$ do not contribute to the
limit, as $t\to+\infty$, of Eq. (2.21.25) itself. The term with $n = m$, on the other
hand, only contributes $c_m$; hence, the proposition is proved. mbe
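The limit in Eq. (2.21.24) can be observed numerically at a large but finite $t$. The sketch below assumes the example pulsations $\omega = (1,\sqrt2)$ and the function $f(t) = \cos t + \frac12\cos(\sqrt2\,t)$; the integral is approximated with the midpoint rule, and the helper name and the values of `t_max` and `steps` are arbitrary choices for illustration.

```python
import cmath
import math

OMEGA = (1.0, math.sqrt(2.0))  # rationally independent pulsations (assumed example)


def f(t):
    # In Eq. (2.21.23): c_(1,0) = c_(-1,0) = 0.5 and c_(0,1) = c_(0,-1) = 0.25.
    return math.cos(OMEGA[0] * t) + 0.5 * math.cos(OMEGA[1] * t)


def time_average_coefficient(func, m, t_max=1000.0, steps=100_000):
    """Midpoint-rule estimate of (1/t) * integral_0^t exp(-i omega.m tau) func(tau) dtau."""
    freq = sum(w * k for w, k in zip(OMEGA, m))
    h = t_max / steps
    total = 0j
    for j in range(steps):
        tau = (j + 0.5) * h
        total += cmath.exp(-1j * freq * tau) * func(tau)
    return total * h / t_max
```

At finite $t$ the terms with $n\ne m$ leave a remainder of order $1/(t\,|(n-m)\cdot\omega|)$, as Eq. (2.21.26) indicates, so the estimate approaches $c_m$ slowly when some combination $(n-m)\cdot\omega$ is small.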


For the sake of completeness, we also wonder about what can be said in
the other cases when $\omega_1,\dots,\omega_d$ are rationally dependent. As an example, the
following proposition will be discussed.

31 Proposition. Let $f\in C^\infty(R)$ be quasi-periodic with rationally dependent
pulsations $\omega_1,\dots,\omega_d > 0$. There exist $p < d$ and $p$ rationally independent
numbers $\bar\omega_1,\dots,\bar\omega_p$ and a multi periodic function $\bar\varphi\in C^\infty(T^p)$ such that

$$f(t) = \bar\varphi(\bar\omega_1 t,\dots,\bar\omega_p t), \quad \forall\,t\in R. \qquad (2.21.27)$$

15 A family $\Omega = (\omega_1,\omega_2,\dots)$ of real numbers is said to consist of rationally
independent numbers when every finite subset $(\omega_{j_1},\dots,\omega_{j_p})$ has the property
that the relation $\sum_{k=1}^p n_k\,\omega_{j_k} = 0$, with $n_1,\dots,n_p$ integers, implies
$n_1 = \dots = n_p = 0$.

Observation. Therefore, if $\omega_1,\dots,\omega_d$ are rationally dependent, it is possible
to reduce the complexity of the representation Eq. (2.21.23) by reducing the
dimension of the multiple series appearing in it.


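PROOF. Consider all the subsets of $\omega_1,\dots,\omega_d$ built with rationally
independent numbers and let $(\tilde\omega_1,\dots,\tilde\omega_p)$ be a maximal one among them (i.e.,
such that $(\tilde\omega_1,\dots,\tilde\omega_p,\omega')$ is not built with rationally independent numbers
no matter which $\omega'\in(\omega_1,\dots,\omega_d)$ is chosen).

Without loss of generality, suppose $\omega_1 = \tilde\omega_1,\dots,\omega_p = \tilde\omega_p$: then for every
$j = p+1,\dots,d$, there are $p$ rational numbers $r^{(j)}_1,\dots,r^{(j)}_p$, all with the
same denominator $N$, as it can and shall be assumed, such that

$$\omega_j = \sum_{k=1}^p r^{(j)}_k\,\omega_k, \quad j = p+1,\dots,d. \qquad (2.21.28)$$

Hence, setting $r^{(j)}_k = m^{(j)}_k/N$, $m^{(j)}_k$ integer, $j = p+1,\dots,d$, $k = 1,\dots,p$,
and $\bar\omega_k = \omega_k/N$, $k = 1,\dots,p$, we see that

$$\omega_j = \sum_{k=1}^p m^{(j)}_k\,\bar\omega_k, \quad j = p+1,\dots,d. \qquad (2.21.29)$$

Defining $m^{(j)}_k = N\,\delta_{jk}$ for $j\le p$, the same relation holds for all
$j = 1,\dots,d$. Now make use of Corollary 29 to get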


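$$\begin{aligned}
f(t) &= \sum_{n_1,\dots,n_d} c_{n_1,\dots,n_d}\, e^{i\sum_{j=1}^d n_j\omega_j\, t}
 = \sum_{n_1,\dots,n_d} c_{n_1,\dots,n_d}\, e^{i\sum_{j=1}^d n_j\left(\sum_{k=1}^p m^{(j)}_k\bar\omega_k\right) t} \\
 &= \sum_{n_1,\dots,n_d} c_{n_1,\dots,n_d}\, e^{i\sum_{k=1}^p \bar\omega_k\left(\sum_{j=1}^d n_j m^{(j)}_k\right) t} \\
 &= \sum_{q_1,\dots,q_p}\Big(\sum_{\substack{n_1,\dots,n_d \\ \sum_j n_j m^{(j)}_k = q_k}} c_{n_1,\dots,n_d}\Big)\, e^{i\sum_{k=1}^p \bar\omega_k q_k\, t}
\end{aligned} \qquad (2.21.30)$$

Therefore, we set

$$\bar c_{q_1,\dots,q_p} = \sum_{\substack{n_1,\dots,n_d \\ \sum_j n_j m^{(j)}_k = q_k,\ k=1,\dots,p}} c_{n_1,\dots,n_d} \qquad (2.21.31)$$

and, from Eq. (2.21.11), it is seen that $\bar c_{q_1,\dots,q_p} = \overline{\bar c_{-q_1,\dots,-q_p}}$. Furthermore,
since $|q_k| \le M\,(\sum_{j=1}^d |n_j|)$ with $M = \max_{k,j} |m^{(j)}_k| \ge 1$, we see that

$$(1+|q_1|)^s\cdots(1+|q_p|)^s\,|\bar c_{q_1,\dots,q_p}| \le M^{ps}\sum_{\substack{n_1,\dots,n_d \\ \sum_j n_j m^{(j)}_k = q_k}} (1+|n_1|)^{ps}\cdots(1+|n_d|)^{ps}\,|c_{n_1,\dots,n_d}| \qquad (2.21.32)$$

The series on the right-hand side of Eq. (2.21.32) can be bounded, with the
help of Eq. (2.21.12), as

$$\sum_{\substack{n_1,\dots,n_d \\ \sum_j n_j m^{(j)}_k = q_k}} (1+|n_1|)^{ps}\cdots(1+|n_d|)^{ps}\,|c_{n_1,\dots,n_d}| \le \sum_{n\in Z^d}\frac{\gamma_{ps+2}}{(1+|n_1|)^2\cdots(1+|n_d|)^2} < +\infty \qquad (2.21.33)$$

Hence, Eqs. (2.21.32) and (2.21.33) mean that the constants $\bar c$ verify an
inequality like Eq. (2.21.12) (with $p$ instead of $d$) and the proposition is now
proved since, by Eq. (2.21.30), we can define

$$\bar\varphi(\xi_1,\dots,\xi_p) = \sum_{q_1,\dots,q_p}\bar c_{q_1,\dots,q_p}\, e^{i\sum_{k=1}^p q_k\xi_k} \qquad (2.21.34)$$
mbe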
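The bookkeeping behind Eqs. (2.21.28) and (2.21.29) (common denominator $N$, reduced pulsations $\bar\omega_k = \omega_k/N$, integer matrix $m^{(j)}_k$) can be sketched in Python with exact rational arithmetic. The function name `reduce_pulsations` and the example data are hypothetical; `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import isclose, lcm, sqrt


def reduce_pulsations(base, rationals):
    """base: the p independent pulsations; rationals[j][k] = r_k^(j) as Fractions.

    Returns (N, reduced, m) with reduced[k] = base[k] / N and integer rows m[j]
    such that every pulsation equals sum_k m[j][k] * reduced[k], Eq. (2.21.29).
    """
    p = len(base)
    N = lcm(*(r.denominator for row in rationals for r in row))
    reduced = [w / N for w in base]
    # Rows for the independent pulsations themselves: m_k^(j) = N * delta_jk.
    m = [[N if k == j else 0 for k in range(p)] for j in range(p)]
    # Rows for the dependent ones: m_k^(j) = N * r_k^(j), an integer by choice of N.
    m += [[int(r * N) for r in row] for row in rationals]
    return N, reduced, m


# Hypothetical example: omega_3 = (1/2) omega_1 + (2/3) omega_2 with omega = (1, sqrt 2, .).
N, reduced, m = reduce_pulsations([1.0, sqrt(2.0)], [[Fraction(1, 2), Fraction(2, 3)]])
```

In this example $N = 6$ and the third pulsation becomes $3\,\bar\omega_1 + 4\,\bar\omega_2$, so a series over $Z^3$ is regrouped into a series over $Z^2$ as in Eq. (2.21.30).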


2.21.1 Exercises and Problems 


1. Compute the Fourier coefficients Po,o, Po, fio of the function f(€1,€2) = 1— + (cos + 
cos €2)~+ with an approximation of 50%. 


2. Same as Problem 1 for f(€1,€2) = 1 — log(cos € + cos £2). 
3. Show that if f(€1,€2) = S372 947 "Cx (cosé1 + cos€2)* with |Ck| < D, there exist 


C > 0,¢>0 such that |fn; na| < Ce7el™1 141721), Estimate C and e in terms of D. 


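4. If $\omega_1/\omega_2$ is irrational, show that, for $\forall\,\varphi\in C^\infty(T^2)$, the closure of the set
of the values taken, as $t\in R_+$, by $f(t) = \varphi(\omega_1 t,\omega_2 t)$ coincides with
$\varphi(T^2)$ = $\varphi$-image of $T^2$. (Hint: See Proposition 27.)

5.* Same as Problem 4 when $\varphi\in C^\infty(T^d)$, $f(t) = \varphi(\omega_1 t,\dots,\omega_d t)$ and $\omega_1,\dots,\omega_d$ are
rationally independent.

6. On the complex plane $C/\{0\}$, define the function $I(z) = e^{i\varphi}$ if $z = \rho e^{i\varphi}\ne 0$,
$\rho > 0$, $\varphi\in R$. Show that $I$ is a $C^\infty$ function of $\mathrm{Re}\,z = x$ and $\mathrm{Im}\,z = y$.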


7. In the context of Problem 6, show that

$$\left|\frac{\partial^k I(z)^n}{\partial x^h\,\partial y^{k-h}}\right| \le |n|^k\, C_k$$

for a suitably chosen $C_k$, for all $z$ such that $\frac12 < |z| < 2$.


8. If $f\in C^\infty(R)$ and $f$ is $2\pi$-periodic and if $\hat f_n$ are the Fourier coefficients of $f$,
consider the function of $z = x+iy$, $x,y\in R$, defined for $\frac12 < |z| < 2$ by

$$F(z) = \sum_{n=-\infty}^{+\infty}\hat f_n\, I(z)^n$$

Using Problem 7, show that $F$, as a function of $x,y$, is $C^\infty$ in the region
$\frac12 < |z| < 2$ and on the unit circle coincides with $f(\varphi) = F(e^{i\varphi})$.


9. Using Problem 8, show the validity of the equivalence claimed in observation (1) to
Definition 13, p. 102, in the case $d = 1$.


10. Same as Problem 9 in the case $d > 1$. (Hint: If $f\in C^\infty(T^d)$ and if $\hat f_{n_1,\dots,n_d}$ are its
Fourier coefficients, let $z = (z_1,\dots,z_d)\in C^d$ and

$$F(z) = \sum_{n_1,\dots,n_d}\hat f_{n_1,\dots,n_d}\, I(z_1)^{n_1}\cdots I(z_d)^{n_d}$$

then show that $F$ is a $C^\infty$ function of $x_j = \mathrm{Re}\,z_j$ and $y_j = \mathrm{Im}\,z_j$, $j = 1,\dots,d$, in a
neighborhood of the torus $S\times\dots\times S$ where $S$ = {unit circle in $C$} identified with $T^d$ via
Eq. (2.21.6).)


2.22 Observables and Their Time Averages 


Observables and time averages play an important role in qualitative as well 
as quantitative developments in Mechanics. It is therefore useful to look more 
closely at them. For the purpose, consider an autonomous differential equation 


$$m\ddot x = f(\dot x, x), \qquad (2.22.1)$$

where $(\eta,\xi)\to f(\eta,\xi)$ is in $C^\infty(R^2)$ and $m > 0$. Suppose, also, that Eq.
(2.22.1) is normal. According to Definition 7, we shall denote by $(S_t)_{t\in R_+}$
the flow which solves Eq. (2.22.1); i.e., if $(\eta,\xi)\in R^2$, the function

$$t\to(\dot x(t), x(t)) = S_t(\eta,\xi), \quad t\in R_+ \qquad (2.22.2)$$

will be such that $t\to x(t)$, $t\in R_+$, is a solution of Eq. (2.22.1) with initial
datum $(\eta,\xi)$. Recall, also, that the map $(t,\eta,\xi)\to S_t(\eta,\xi)$, defined on
$R_+\times R\times R$ and with values in $R\times R$, is a $C^\infty$ map and

$$S_{t+t'} = S_t\, S_{t'}, \quad \forall\,t,t'\in R_+; \qquad (2.22.3)$$

see Corollary 9. In the above context, introduce the following concepts.


14 Definition. The set of $C^\infty$ functions on $R^2$, thought of as the space of the
initial data of Eq. (2.22.1), will be called the set of instantaneous “observables”
for the point mass described by Eq. (2.22.1).

If $t\to S_t(\eta,\xi)$, $t\ge 0$, is a motion of Eq. (2.22.1) and if $F$ is an observable,
let the “value” of the observable at time $t\in R_+$ on the motion with initial
datum $(\eta,\xi)$ be

$$F(\dot x(t), x(t)) = F(S_t(\eta,\xi)). \qquad (2.22.4)$$

The function $t\to F(S_t(\eta,\xi))$, $t\in R_+$, is the “history” of the observable $F$ on
the motion with data $(\eta,\xi)$.


Observations. 

(1) The motivation for the above terminology is clear. What perhaps needs 
a few words of comment is why one defines an observable as a function of 
velocity and position only, see Eq. (2.22.4), rather than, more generally, as a 
function of acceleration and higher derivatives as well. 

Actually, such a definition would not be more general since, via Eq. (2.22.1)
and by what has been observed in §2.4, it is possible to compute all the
derivatives of $t\to x(t)$ of order higher than the first by repeatedly
differentiating both sides of Eq. (2.22.1), once $x(t)$ and $\dot x(t)$ are known.

(2) Therefore, the observables correspond to physical entities measurable by 
observing velocity and position of the point mass at a given instant: they are 
a mathematical model of such entities. 


Given an observable $F$ and a motion $t\to S_t(\eta,\xi)$, $t\in R_+$, one can raise
several questions about the observations of $F$ at various times. As an example,
the notion of average value of an observable on a given motion will be discussed
below.

It is important to remember and to stress that, concerning the notion of 
the average value of an observable, it is possible to repeat what has already 
been said about the notion of the stability of equilibrium. It makes no sense 
to provide an absolute definition of average value of an observable as time 
elapses. In fact, it is possible to give several meanings to this concept, each 
corresponding to different needs that may naturally emerge in applications. 

Here and in the following sections, only a few interesting examples of 
definition of time averages will be examined, leaving it to the reader to imagine 
applications in which a particular definition may appear as a relevant one. 
The reader should also try to imagine other definitions and the corresponding 
situations to which they could naturally apply: the methods explained below 
could then be used to elucidate their properties. 


15 Definition. Let $F\in C^\infty(R^2)$ be an observable for the motions described
by (2.22.1) and let $T > 0$. We define the continuous average value of $F$ on
the motion with initial datum $(\eta,\xi)\in R^2$ and on the time interval $[0,T]$ as

$$M_T(F;\eta,\xi) = \frac{1}{T}\int_0^T F(S_t(\eta,\xi))\, dt \qquad (2.22.5)$$

The “continuous average value” of $F$ on the motion with initial datum $(\eta,\xi)$
will be, whenever it exists, the limit

$$\bar F(\eta,\xi) = \lim_{T\to+\infty} M_T(F;\eta,\xi). \qquad (2.22.6)$$


Similarly, one could define the average value with observation step $a > 0$:

16 Definition. If $F\in C^\infty(R^2)$ is an observable for the motions described
by Eq. (2.22.1) and if $N$ is a positive integer, the discrete average value with
time step $a$ of $F$ on the motion with initial datum $(\eta,\xi)$ and relative to $N$
observations, is defined as

$$M_N^{(a)}(F;\eta,\xi) = \frac{1}{N}\sum_{j=0}^{N-1} F(S_{ja}(\eta,\xi)). \qquad (2.22.7)$$

The “discrete average value” with step $a$ of $F$ on the motion $t\to S_t(\eta,\xi)$,
$t\in R_+$, is defined by the limit, whenever it exists,

$$\bar F^{(a)}(\eta,\xi) = \lim_{N\to\infty} M_N^{(a)}(F;\eta,\xi) \qquad (2.22.8)$$


Why should one refrain from considering a more general notion? 


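17 Definition. If $\varphi\in C^\infty(R_+)$ and if $T > 0$, $N\in Z_+$, $a > 0$ let

$$\begin{aligned}
M_T(\varphi) &= \frac{1}{T}\int_0^T\varphi(\tau)\, d\tau, & \bar\varphi &= \lim_{T\to\infty} M_T(\varphi) \\
M_N^{(a)}(\varphi) &= \frac{1}{N}\sum_{j=0}^{N-1}\varphi(ja), & \bar\varphi^{(a)} &= \lim_{N\to+\infty} M_N^{(a)}(\varphi)
\end{aligned} \qquad (2.22.9)$$

whenever the limits exist. The quantities defined in Eq. (2.22.9) will be called
the “continuous average of $\varphi$ on $[0,T]$”, the “continuous average of $\varphi$”, the
“discrete average of $\varphi$ on $N$ observations with time step $a$”, and the “discrete
average of $\varphi$ with time step $a$”.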
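The two finite-time averages of Eq. (2.22.9) are easy to compute for an explicit history. The sketch below is only illustrative: the integral is approximated by the midpoint rule, the function names are hypothetical, and the history chosen is that of the kinetic energy $\frac12\eta^2$ on the motion $x(t) = \sin t$ of $\ddot x + x = 0$, $x(0) = 0$, $\dot x(0) = 1$, assuming unit mass.

```python
import math


def continuous_average(phi, T, steps=100_000):
    """M_T(phi) of Eq. (2.22.9), approximated by the midpoint rule."""
    h = T / steps
    return sum(phi((j + 0.5) * h) for j in range(steps)) * h / T


def discrete_average(phi, a, N):
    """M_N^(a)(phi) of Eq. (2.22.9): mean of phi(0), phi(a), ..., phi((N-1)a)."""
    return sum(phi(j * a) for j in range(N)) / N


def kinetic_history(t):
    # History of eta^2 / 2 on the motion x(t) = sin t, i.e. (1/2) cos(t)^2.
    return 0.5 * math.cos(t) ** 2
```

Here the continuous average over many periods is close to $\frac14$, while the discrete average with step $a = \pi$ (commensurable with the period) is $\frac12$ and the one with step $a = 1$ is again close to $\frac14$.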


Observations. 

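(1) If $\varphi$ is constant, $\bar\varphi = \bar\varphi^{(a)} = \varphi$.

(2) If $\lambda = \lim_{t\to+\infty}\varphi(t)$ exists, then $\bar\varphi = \bar\varphi^{(a)} = \lambda$: in fact, note that
$M_T(\varphi)-\lambda = M_T(\varphi-\lambda)$ and if $T_\varepsilon$ is such that, $\forall\,t > T_\varepsilon$, $|\varphi(t)-\lambda| < \varepsilon$, one has

$$M_T(\varphi-\lambda) = \frac{1}{T}\int_0^{T_\varepsilon}(\varphi(\tau)-\lambda)\, d\tau + \frac{1}{T}\int_{T_\varepsilon}^{T}(\varphi(\tau)-\lambda)\, d\tau \qquad (2.22.10)$$

and the first term in the right-hand side of Eq. (2.22.10) goes to zero as
$T\to\infty$, while the second is bounded by $T^{-1}(T-T_\varepsilon)\,\varepsilon < \varepsilon$. Hence
$\lim_{T\to\infty} M_T(\varphi-\lambda) = 0$ by the arbitrariness of $\varepsilon$, and $\bar\varphi = \lambda$. Similarly, one
checks that $\bar\varphi^{(a)} = \lambda$.

(3) If $\varphi\in C^\infty(R)$ is periodic with period $T_0 > 0$,

$$\lim_{T\to\infty} M_T(\varphi) = \bar\varphi = \frac{1}{T_0}\int_0^{T_0}\varphi(\tau)\, d\tau. \qquad (2.22.11)$$

In fact, if $T = nT_0+\theta$ with $n$ integer and $\theta\in[0,T_0]$, it follows that
$T\to\infty \Leftrightarrow n\to\infty$ and

$$M_T(\varphi) = \frac{1}{nT_0+\theta}\Big(n\int_0^{T_0}\varphi(\tau)\, d\tau + \int_0^{\theta}\varphi(\tau)\, d\tau\Big) \qquad (2.22.12)$$

implying Eq. (2.22.11).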

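(4) If $\varphi\in C^\infty(R)$ is periodic with period $T_0 > 0$ and if $a > 0$ is such that
$T_0/a = p/q$ with $p$ and $q$ relatively prime integers (i.e., if $T_0/a$ is rational), it
follows that

$$\bar\varphi^{(a)} = \sum_{m=-\infty}^{+\infty}\hat\varphi_{mp}, \quad\text{and}\quad \bar\varphi^{(a)} = \frac{1}{p}\sum_{j=0}^{p-1}\varphi(ja) \qquad (2.22.13)$$

where $\hat\varphi_n$ are the harmonics of $\varphi$ relative to the period $T_0$. The second relation
in Eq. (2.22.13) can be proved as in (3) above, because the sequence $\varphi(ja)$ is
periodic in $j$ with period $p$ (as $pa = qT_0$). To prove the first, note that

$$\begin{aligned}
M_N^{(a)}(\varphi) &= \frac{1}{N}\sum_{j=0}^{N-1}\varphi(ja) = \frac{1}{N}\sum_{j=0}^{N-1}\sum_{n\in Z}\hat\varphi_n\, e^{\frac{2\pi i}{T_0} n\, ja} \\
 &= \sum_{n\in Z}\hat\varphi_n\Big(\frac{1}{N}\sum_{j=0}^{N-1} e^{\frac{2\pi i}{T_0} n\, ja}\Big)
\end{aligned} \qquad (2.22.14)$$

and the term in brackets has modulus $\le 1$ (as an average of numbers with
modulus not exceeding 1). Hence, the series in Eq. (2.22.14) is uniformly
convergent in $N$ and the limit as $N\to\infty$ can be taken term by term. As
already remarked (see Eq. (2.21.26)), if $e^{2\pi i n a/T_0}\ne 1$, one finds

$$\frac{1}{N}\sum_{j=0}^{N-1} e^{\frac{2\pi i n a}{T_0} j} = \frac{1}{N}\,\frac{e^{\frac{2\pi i n a}{T_0} N}-1}{e^{\frac{2\pi i n a}{T_0}}-1}\ \xrightarrow[N\to\infty]{}\ 0 \qquad (2.22.15)$$

while if $e^{2\pi i n a/T_0} = 1$, i.e., if $na/T_0$ is an integer (i.e., $n = mp$ for some
$m\in Z$), the sum (2.22.15) is clearly 1, identically, $\forall\,N$. Hence, by taking the
limit as $N\to\infty$ in Eq. (2.22.14), Eq. (2.22.13) follows.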
(5) If $\varphi\in C^\infty(R)$ is $T_0$-periodic, $T_0 > 0$, and if $T_0/a$ is irrational, then

$$\bar\varphi^{(a)} = \bar\varphi = \frac{1}{T_0}\int_0^{T_0}\varphi(\tau)\, d\tau \qquad (2.22.16)$$

This is true because, in the present case, in the series (2.22.14), all the terms
tend to zero except the one with $n = 0$ (as $\exp(2\pi i n a/T_0)\ne 1$, $\forall\,n\ne 0$ [see,
also, Eq. (2.22.15)]).

(6) If $\varphi\in C^\infty(R)$ is periodic with period $T_0 > 0$, let $a > 0$ vary so that $T_0/a$
stays rational, $T_0/a = p/q$ with $p$ and $q$ relatively prime integers, while
$p\to\infty$.$^{16}$ Then it follows from Eq. (2.22.13) and from the decay properties
of the Fourier coefficients of $\varphi$ that $\bar\varphi^{(a)}\to\bar\varphi$. Hence, the “less $T_0$ is
commensurable with $a$”, the closer the discrete average $\bar\varphi^{(a)}$ is to the
continuous average $\bar\varphi$.

16 The number $p$ measures the number of times it is necessary to repeat $a$ to reach a multiple
of $T_0$, i.e. it measures the “commensurability” of $T_0$ with respect to $a$.
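The finite-sum form of Eq. (2.22.13) lends itself to a direct check. In the sketch below the function $\varphi(t) = \cos^2 t$ is an assumed example with period $T_0 = \pi$, whose harmonics relative to $T_0$ are $\hat\varphi_0 = \frac12$ and $\hat\varphi_{\pm1} = \frac14$; the helper name is hypothetical, and `Fraction` is used only to reduce $p/q$ to lowest terms.

```python
from fractions import Fraction
import math


def discrete_average_rational(phi, period, ratio):
    """Finite-sum form of Eq. (2.22.13) when T_0 / a = ratio = p / q in lowest terms."""
    p = ratio.numerator          # Fraction already reduces p / q to lowest terms
    a = period / float(ratio)    # the observation step a = T_0 * q / p
    return sum(phi(j * a) for j in range(p)) / p


def phi(t):
    # Period T_0 = pi; harmonics relative to T_0: 1/2 (n = 0) and 1/4 (n = +1, -1).
    return math.cos(t) ** 2
```

For $p = 1$ the harmonic sum $\sum_m\hat\varphi_{mp}$ contains all three harmonics and gives 1, while for every $p\ge 2$ only $\hat\varphi_0 = \frac12$ survives, which is the continuous average: a simple instance of observation (6).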


The following proposition is a consequence of the above remarks and an
example of questions related to the corresponding definitions.


32 Proposition. Let $V\in C^\infty(R)$ be bounded below and consider the motions
associated with Eq. (2.22.17):

$$m\ddot x(t) = -\frac{\partial V}{\partial\xi}(x(t)), \quad t\in R_+. \qquad (2.22.17)$$

If $F$ is an observable “with bounded support” (i.e., if $F(\eta,\xi) = 0$ when
$|\eta|+|\xi|$ is large enough), every initial datum $(\eta,\xi)\in R^2$ gives rise to a motion
on which both the continuous and the discrete averages with step $a > 0$ are
defined.

If $\lim_{\xi\to\pm\infty} V(\xi) = +\infty$, every observable (whether with bounded support or
not) has well-defined average values, continuous and discrete. In this case the
continuous and discrete averages with step $a > 0$ coincide on all motions, with
the possible exception of the periodic motions with period commensurable with
$a$.


PROOF. From Proposition 11, p. 37, it follows that the motions described by
Eq. (2.22.17) either approach infinity or tend toward a well-defined limit (i.e.,
$\lim_{t\to+\infty} S_t(\eta,\xi) = (0,\xi_0)$) or are periodic.

In the first two cases, the above proposition follows from Observation 2 to
Definition 17, while in the third case, it follows from Observations 3 and 4.
The assumption on the support of $F$ is needed to deal with the case when
$S_t(\eta,\xi)\to\infty$: this case cannot occur, according to the law of conservation of
energy, when $V$ diverges at infinity; hence, in this case, no restriction on $F$ is
necessary. mbe


2.22.1 Exercises and Problems 


1. Compute the continuous average along the motions $\ddot x + x = 0$, $x(0) = 0$, and $\dot x(0) = 1$ of
the kinetic energy and of the squared elongation (i.e., of the observables $f(\eta,\xi) = \frac12\eta^2$ or
$g(\eta,\xi) = \xi^2$).


2. Compute the difference between the continuous average of the kinetic energy and that of
the potential energy in the oscillations of $m\ddot x = -kx$ with energy $E$. Compute their values
as functions of $E$.


3. Compute the discrete average of the kinetic energy for the motion % + x = 0, x(0) = 


7 — = 17 
0, #(0) = 1 for a = 27, 47, 5,1, 2, 73° 


4. Same as Problem 1 for the motion ž + sin x = 0, x(0) = 0, «(0) 3 with 60% accuracy. 


5. Same as Problem 3 for the motion in Problem 4 with 60% accuracy. 


6. Same as Problems 4 and 5 with 1% accuracy (using a computer). 
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7. Same as Problem 1 for the motion in Problem 4. Estimate the accuracy needed in the 
computations to see a difference between the linear-oscillator and pendulum results. 


8. Compute the average value of the elongation, and of the square of the elongation, in the 
motion & +2 = cost, x(0) = 0, (0) = 0 in the continuous case and in the discrete case 
with step a= 7, i, 2: 


9. Show that if $\varphi,\psi\in C^\infty(R)$ and $\lim_{t\to+\infty}|\varphi(t)-\psi(t)| = 0$ and if $\varphi$ has an average value
of any type, then $\psi$ has the same average value.


10. Apply Problem 9 to calculate the continuous average of the squared elongation in the
motion of the oscillator $\ddot x + \dot x + x = \cos t$, $x(0) = 0$, $\dot x(0) = 0$. How does this average change
by changing the initial datum? (Answer: It does not change.)


11. Define work per unit time of a force $f$ on a point with velocity $v$ the quantity $v\cdot f$:
see p.144 for the general definition. Arbitrarily choose a definition of average and estimate
the average work done by the friction force (“dissipation per unit time”, i.e., average of the
observable $w(\eta,\xi) = -\eta^2$) in the motions of the oscillator in Problem 10.


12. In the context of Problem 11, compare the average work per unit time done by the 
friction force and that done by the forcing force. Interpret the value of their difference. 


13. Compute, in general, the continuous average value of the work done by the forcing
force and by the friction force in the motions of the oscillators $m\ddot x + \lambda\dot x + kx = f(t)$ with
$m,\lambda,k > 0$ and $f(t) = F\cos\omega t$, $F,\omega\in R$. Also compute the continuous average value of
the potential or kinetic energy.


14.* Same as Problem 13 but with a generic $2\pi/\omega$-periodic $C^\infty$ forcing force $f$. Express
the results by means of the harmonics of $f$ and of the parameters $m,\lambda,k$.


15. In the context of Problem 13, find the value of w to which corresponds maximum average 
work done by the forcing term (“resonant pulsation” ). 


16.* If $f\in C^\infty(R)$ is a quasi-periodic function in the sense of Definition 11, then the average
values of $f$ exist both in the continuous and the discrete sense. Find expressions for such
quantities and show that if the pulsations of $f$ are $\omega_1,\dots,\omega_d$ and if $\{\omega_1,\dots,\omega_d,2\pi/a\}$ are
$(d+1)$ rationally independent numbers, then the discrete average of $f$ with step $a > 0$ and
the continuous average of $f$ coincide. (Hint: Use the representation of Eq. (2.21.23) and
proceed as in Observation 4, Eq. (2.22.14).)


17. Find an example of a function in $C^\infty(R)$ which does not have a continuous average.


18. Estimate within 60% the average kinetic energy in the motion with energy E = 10 of 
the oscillator #+ 23 = 


19.* Same as Problem 18 with 1% accuracy (using a computer). 


20.* Show that if a potential energy produces periodic motions with period $T(E)$ which,
as $E$ varies in $[E_0,E_1]$, is such that $T'(E) > 0$, then the discrete average with step $a = 1$
and the continuous average of an arbitrary observable coincide for a dense set of values of
$E\in[E_0,E_1]$, while they do not coincide, in general, on another dense set. The second set
is, however, denumerable. (Hint: By the implicit functions theorem, deduce that $T(E)/a$
is irrational for all but countably many values of $E\in[E_0,E_1]$.)


21. Show that the same results of Problem 20 hold if $T(E)$ is strictly monotonically
increasing in $[E_0,E_1]$. They also hold if $T'(E) = 0$ only finitely many times in $[E_0,E_1]$.




2.23 Time Averages on Sequences of Times known up to 
Errors. Probability and Stochastic Phenomena 


... or mi di’ anche: 
Questa Fortuna di che tu mi tocche, 
Che è, che i ben del mondo ha si tra branche? 17 


The continuous averages as well as the step-$a$ discrete averages are, as is
easily understood, very idealized mathematical notions, even when $T$ or $N$
are $< +\infty$. To be really measured, the continuous averages would demand
an infinity of measurements of $f$, one per each time, and there is no need to
underline the degree of abstraction that must be assumed in order to imagine
such a sequence of measurements.

Only at first sight are the discrete averages “more concrete“ notions. It 
is in fact unthinkable to be able to perform measurements at time intervals 
exactly equal to a, because of the unavoidable errors of time measurement. 

Obviously, considerations of measurement errors could have been brought 
up in correspondence with almost every question studied so far or it could be 
brought up in correspondence with any future question. Arbitrarily, we decide 
to discuss it now in connection with the analysis of the averages of functions 
or observables. 

The methods and ideas involved in the effort of making precise the notion 
of error in the time average computations present the greatest interest and 
are very general: they could be applied to the consideration of errors in the 
context of other problems, and the reader could try some of these applications 
by himself. 

A very naive schematization of the data accumulation process for calculating
an average is the following: one measures$^{18}$ $f$, the function that we want to
average, at the initial time $\tau_0\simeq 0$; then we wait a time interval $\tau_1\simeq a$ and
repeat, again, the measurement of $f$, and subsequently the operation is repeated
after waiting times $\tau_2,\tau_3,\dots$ etc.: every $\tau_i$, $i = 1,2,\dots$, is approximately equal
to $a$, though not exactly because of the errors made in the measurement of the
time intervals. Afterwards, the average of $f$ will be defined as the “average of
the results thus obtained”. Such an average, instead of being

$$M_N^{(a)}(f), \qquad (2.23.1)$$

will be

$$\frac{1}{N}\sum_{j=0}^{N-1} f(\tau_0+\tau_1+\dots+\tau_j) \qquad (2.23.2)$$


17 In basic English: 
.. now tell me also: 
This Fortune of whom you speak 
What is she, that the world’s goods holds so firmly in her hands? 
(Dante, Inferno, Canto VII). 


18 For the sake of simplicity, ability to perform exact measurements of f will be supposed 
so that the only source of error comes from the measurement of the time intervals. 




Time measurement errors will be further idealized by imagining that

$$\tau_0 = \varepsilon_0, \qquad \tau_j = a+\varepsilon_j, \quad j = 1,2,\dots \qquad (2.23.3)$$

and $\varepsilon_j = \pm\varepsilon$ with $\varepsilon > 0$ fixed, $\varepsilon < a$, and the sign of $\varepsilon_j$ is “randomly chosen”.

One can think of a simple mechanism producing a sequence of errors like
those in Eq. (2.23.3). Assume to be able to perform perfect time
measurements, but to proceed deliberately as follows: at the initial time toss
a coin and perform a measurement of $f$ at time $\tau_0 = \varepsilon_0$, where $\varepsilon_0 = \varepsilon$ if the
result is “heads”, while $\varepsilon_0 = -\varepsilon$ if the result is “tails”.

At time $\tau_0$ we again toss the coin and perform the measurement of $f$ at
time $\tau_0+\tau_1$, where $\tau_1 = a+\varepsilon_1$ and $\varepsilon_1 = \pm\varepsilon$ according to the result,$^{19}$ etc.
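The coin-tossing mechanism just described can be imitated on a computer. The sketch below is a hedged illustration, not part of the text: a pseudo-random generator stands in for the coin (an assumption), and the function names `jittered_times` and `jittered_average` are hypothetical.

```python
import random


def jittered_times(a, eps, N, rng):
    """Observation times tau_0, tau_0 + tau_1, ... with tau_0 = eps_0, tau_j = a + eps_j."""
    times = []
    t = 0.0
    for j in range(N):
        error = eps if rng.random() < 0.5 else -eps   # "heads" -> +eps, "tails" -> -eps
        t += error if j == 0 else a + error
        times.append(t)
    return times


def jittered_average(f, a, eps, N, rng):
    """The average (2.23.2) of f over the randomly perturbed observation times."""
    return sum(f(t) for t in jittered_times(a, eps, N, rng)) / N
```

A call such as `jittered_average(math.cos, 1.0, 0.1, 1000, random.Random(0))` (after `import math`) can then be compared with the unperturbed discrete average with step $a = 1$; the seed is fixed only to make the run reproducible.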

One can debate at length on which would be the best mathematical model 
allowing a satisfactory translation into mathematically clear terms of the just- 
described sequence of operations. The most interesting mathematical scheme 
is based on the notion of probability. 


18 Definition. Let $\mathcal{E}$ be a finite set of elements which will be called “possible
events”. On $\mathcal{E}$, let $p$ be a function with $p(e)\ge 0$ such that

$$\sum_{e\in\mathcal{E}} p(e) = 1. \qquad (2.23.4)$$

The pair $(\mathcal{E},p)$ will be called a “probability distribution” on $\mathcal{E}$. If $A\subset\mathcal{E}$ is a
subset of $\mathcal{E}$, we set

$$p(A) = \sum_{e\in A} p(e) \qquad (2.23.5)$$

and we say that $p(A)$ is the probability of $A$ with respect to the distribution
$(\mathcal{E},p)$.
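Definition 18 translates directly into a few lines of Python. The following sketch assumes that the distribution is given as a dictionary from events to exact fractions; the name `make_distribution` and the two-toss example are hypothetical.

```python
from fractions import Fraction


def make_distribution(weights):
    """Check (2.23.4) and return the function A -> p(A) of Eq. (2.23.5)."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("probabilities must be nonnegative")
    if sum(weights.values()) != 1:
        raise ValueError("probabilities must sum to 1")

    def probability(subset):
        return sum(weights[e] for e in subset)

    return probability


# Hypothetical example: the four outcomes of two tosses of a fair coin.
two_tosses = {"HH": Fraction(1, 4), "HT": Fraction(1, 4),
              "TH": Fraction(1, 4), "TT": Fraction(1, 4)}
p = make_distribution(two_tosses)
```

For instance, `p({"HH", "HT"})` is the probability that the first toss gives heads. Exact fractions are used so that the normalization (2.23.4) can be tested without rounding errors.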


The above notion of probability is precise from a mathematical point of 
view, but its connection with reality is far less evident. A relation between 
this definition and the empirical world cannot be established on a deductive 
basis in the same way as it is not possible to establish deductively the relation 
between solutions to differential equations and motions of point masses. 

The theory of a point mass motion, if identified with the theory of a class 
of differential equations, appears to us as natural only after long practice 
and experience in comparing the relations between the mathematical model 
and the corresponding empirical, i.e., experimental, properties of “real” point 
masses. In this comparison, one refines both the mathematical intuition on 
the structure of the solutions of some differential equations and the physical 
intuition about the nature of motion. 

Even a superficial knowledge of the theory of differential equations has the
consequence that one cannot avoid observing motions, perhaps unconsciously,
more and more closely to unveil in them those properties which are suggested
by their analytical model as solutions of a differential equation.

19 In other words, instead of leaving the “coin tossing” to the measurement instruments,
we “do it ourselves”.

Similarly, the notion of probability allows the formulation of mathematical 
models of stochastic (i.e. random) phenomena and the quantitative evaluation 
of the probability of classes of events, reaching results such as “that class of 
events has large probability” or “probability 1/2”, etc. In terms of empirical
interpretation, the meaning to attribute to such results becomes clearer and 
more refined while one proceeds in the applications, and this allows us to 
think of them again in more intuitive terms, more immediately expressible in 
an empirical language and in empirical prescriptions. 

The key to the empirical interpretation of the notion of probability is
the following: consider a “stochastic phenomenon” developing “following the
judgement of Her, which is as hidden as a snake in the grass”,²⁰ which we
imagine “reproducible” and whose possible events form a certain set E. To say
that a mathematical model for such a phenomenon is given by the probability
distribution (E, p) means to formulate a law (on an empirical basis) stating
that in “n trials”, or “repetitions of the production of the event”, the event
e ∈ E²¹ will happen about p(e) n times, if n is large, and the deviations
from this value are very small, ≪ p(e) n, except in “particularly
unlucky” situations which can be disregarded “for all practical purposes”.

One can wonder about what could be the predictive power of such a law. 
This power, in fact, is enormous when it is formulated a priori, i.e., without 
having first measured the occurrence frequencies of every event of E over a
large number of “trials”. The laws of dynamics have the same extent of power 
when they are applied to cases to which they are believed to be applicable, 
but for which the actual applicability has not been checked a priori and will 
be checked only a posteriori (think of the microscopic theory of gases, or of 
the planetary system theory). 

Obviously sometimes a formulated law may be wrong, i.e., the distribution 
(E, p) may not be a good model of the stochastic phenomenon in the preceding 
sense. This may happen for two reasons. 

First, the phenomenon may be stochastic but the empirical law on the 
existence of a well-defined frequency of realization of every possible event may 
not hold, in the limit of a large number of trials. In mechanics an analogous 
situation would occur in discovering a point mass for which one could find, 
after a few direct measurements of force and corresponding acceleration, that 
the two physical entities are not proportional. 

Alternatively, it might happen that the probability law (E, p), assumed
as modelling the phenomenon under analysis, foresees occurrence frequencies
different from the observed ones: this circumstance would have the analogue,
in the mechanics of a point mass, of a case where we had “forgotten” to list
some force f among the forces acting on the point.

20 “Seguendo lo giudicio di costei, / che è occulto come in erba l’angue” (Dante, Inferno,
Canto VII).

21 E could be the six faces of a die and a “trial” could be one toss of the die (after
suitably “shaking” it); the produced event would be the upper face of the die after
the toss: if the die is “fair” then p(e) = 1/6.

The discussion on the notion of probability and on its empirical interpre- 
tation will be stopped here. One could continue it for much longer, at the risk 
of making the issue and the content of the analysis increasingly nebulous. In 
fact it is more useful and constructive to illustrate the content of Definition 
18 via a few applications to the problems which interest us. 

To have at hand a more flexible language, it is convenient to agree on a 
few more “simple” definitions. First comes the notion of “random variable”. 


19 Definition. Let (E, p) be a probability distribution.

(i) Any real function f on E will be called a “random variable”.

(ii) If $a_1, a_2, \ldots, a_{n(f)}$ are the pairwise distinct values taken by f(e) as e
varies in E, we shall call $E_1, E_2, \ldots, E_{n(f)}$ the corresponding sets of events
of E; i.e., for $i = 1, 2, \ldots, n(f)$ the set $E_i$ consists of those elements e ∈ E
such that $f(e) = a_i$. The sets $(E_1, \ldots, E_{n(f)})$ are pairwise disjoint and their
union is E. Therefore, they form a “partition” $P_f$ of E, which will be called
“partition of E associated with f”.

(iii) The “probability distribution” of the random variable f is the probability
distribution $(I_f, p_f)$, where $I_f$ has as elements the n(f) sets $E_1, \ldots, E_{n(f)}$ and

$$p_f(E_i) = p(E_i) = \sum_{e \in E_i} p(e). \tag{2.23.6}$$

(iv) More generally, if P is a partition of E into n sets $(E_1, \ldots, E_n)$, we shall
define $(E_P, p_P)$, the “probability distribution associated with P”, as being the
probability distribution in which the elements of $E_P$ are the sets constituting
the partition P and, if $E_j \in P$,

$$p_P(E_j) = p(E_j) = \sum_{e \in E_j} p(e). \tag{2.23.7}$$


Observation. The notion of the probability distribution of a random variable is
a relevant one when we are only interested in the random event e ∈ E via the
value f(e). It is in fact clear that we can identify all the events e ∈ E giving
rise to the same value of f(e) and call “event” such a collection of events.

Suppose, for instance, that we perform a measurement of a quantity g and that
such a measurement is affected by an error which can be thought of as due to
N “causes”, all independent of each other and each producing an additive
error on the value of g which is ±ε with equal probability. A complete
description of the error is therefore an N-tuple $e = (\varepsilon_1, \ldots, \varepsilon_N)$ of numbers which
take the values $\varepsilon_i = \pm\varepsilon$; the hypothesis of independence and equal probability
of the various errors will be translated into a model by saying that all
the N-tuples e are equally probable; i.e., on the space E of the $2^N$ sequences
$e = (\varepsilon_1, \ldots, \varepsilon_N)$, with $\varepsilon_i = \pm\varepsilon$, the probability distribution²² $p(e) = 2^{-N}$ is
defined.

Suppose, however, that we are not interested in knowing the details of the
individual error occurrences but just in the total error:

$$f(e) = \sum_{i=1}^{N} \varepsilon_i. \tag{2.23.8}$$

This is a random variable on E. It can take the values
$N\varepsilon, (N-2)\varepsilon, \ldots, (-N+2)\varepsilon, -N\varepsilon$, and the value $(N-2k)\varepsilon$ is taken on all the
sequences e containing exactly k minus signs: call $E_k$ the set of all such
sequences. Then the set $I_f$, in this example, consists of the N + 1 elements
$E_0, E_1, \ldots, E_N$ and

$$p_f(E_k) = p(E_k) = \sum_{e \in E_k} \frac{1}{2^N} = \frac{1}{2^N} \binom{N}{k}. \tag{2.23.9}$$

The probability distribution $(I_f, p_f)$ can be regarded as a model for the total
error without explicit reference to the elementary errors $\varepsilon_i$.
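
The passage from the elementary errors to the total error can be checked by brute force for small N. The following sketch (the function name `total_error_distribution` and the choice N = 6 are assumptions made for illustration) enumerates all sign sequences, groups them by the value of the total error measured in units of ε, and compares the result with Eq. (2.23.9).

```python
from itertools import product
from math import comb
from fractions import Fraction
from collections import defaultdict

def total_error_distribution(N):
    """Distribution of f(e) = eps_1 + ... + eps_N, in units of eps.

    Every sign sequence has probability 2**(-N); sequences are grouped by
    the value of f, i.e. by the sets E_k with exactly k minus signs.
    """
    weight = Fraction(1, 2 ** N)
    dist = defaultdict(Fraction)
    for signs in product((+1, -1), repeat=N):
        dist[sum(signs)] += weight
    return dict(dist)

N = 6
dist = total_error_distribution(N)
for k in range(N + 1):
    # the value (N - 2k) eps is taken with probability 2^-N * binom(N, k)
    assert dist[N - 2 * k] == Fraction(comb(N, k), 2 ** N)
print(sorted(dist.items()))
```

The enumeration costs $2^N$ steps, so it is only meant as a check of the formula for small N.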


The preceding definition provides a method for building new probability 
distributions, starting from a given probability distribution. It is useful, in 
this respect, also to give the following definition providing another way of 
constructing new probability distributions starting from a given one (E, p), as 
suggested by the above observation. 


20 Definition. Let (E, p) be a probability distribution. Let N be a positive
integer. We shall denote by $(E, p)^N$ the probability distribution on $E^N$ associating
with the event $e = (e_1, \ldots, e_N) \in E^N$ the probability $p^N(e)$:

$$p^N(e) = p(e_1)\, p(e_2) \cdots p(e_N). \tag{2.23.10}$$

The distribution $(E, p)^N$ will be called the “distribution of N events
independently extracted with distribution (E, p)”.


This series of definitions, necessary to establish a concise and suggestive 
language for the formulation of some interesting propositions, will be con- 
cluded by describing the important notion of a sequence of random variables 
converging in probability to a constant limit. 


21 Definition. Let $(E_N, p_N)$, N = 1, 2, ..., be a sequence of probability
distributions and let $f_N$ be a random variable defined on $E_N$, N = 1, 2, .... The
sequence $(f_N)_{N=1}^{\infty}$ of random variables is said to “converge in probability” to
a limit $\ell \in \mathbb{R}$ as N → ∞, if²³

$$\lim_{N \to \infty} p_N(\{e \mid e \in E_N,\ |f_N(e) - \ell| > \varepsilon\}) = 0 \tag{2.23.11}$$

for all ε > 0.

22 This is a celebrated error model. It was used by Gauss for his mathematical theory of
errors, one of the first grandiose applications of probability theory.

23 We use the convention that {e | e ∈ A, f(e) ∈ B} means “subset of A consisting of those
e’s such that f(e) ∈ B”.
Let us provide some examples. 


33 Proposition. Let (E, p) be a probability distribution and let f be a random
variable on (E, p). Define the random variable $f_N$ on $(E, p)^N$ as

$$f_N(e) = \frac{1}{N} \sum_{i=1}^{N} f(e_i) \tag{2.23.12}$$

if $e = (e_1, \ldots, e_N) \in E^N$. Then the sequence $f_N$ converges in probability to
$\bar f = \sum_{e \in E} p(e) f(e)$ as N → ∞.
Observations. 


(1) This proposition (“law of large numbers”) tells us that the average value
of a sum of N independent random variables is “almost constant” if N is
large or, better, that the probability that such an average value differs from a
certain constant $\bar f$ by more than a given quantity ε approaches 0 as N → ∞
[see Eq. (2.23.12)].

(2) This proposition clarifies why the quantity $\sum_{e \in E} p(e) f(e)$ is called the
“average value” of the random variable f with respect to the probability
distribution (E, p).


The proof of Proposition 33 relies on a very elementary but very important 
inequality (“the Chebyshev inequality”) which underlies many probabilistic
estimates. 


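34 Proposition. Let f be a random variable with respect to the probability
distribution (E, p). Define the “k-th moment” of f as

$$\mu_k(f) = \sum_{e \in E} |f(e)|^k\, p(e), \qquad k \in \mathbb{Z}_+. \tag{2.23.13}$$

Then for $k \in \mathbb{Z}_+$ and δ > 0,

$$p(\{e \mid e \in E,\ |f(e)| > \delta\}) \le \frac{\mu_k(f)}{\delta^k}. \tag{2.23.14}$$

PROOF. By Eq. (2.23.13),

$$\mu_k(f) \ge \sum_{e:\ |f(e)| > \delta} |f(e)|^k\, p(e) \ge \delta^k \sum_{e:\ |f(e)| > \delta} p(e) = \delta^k\, p(\{e \mid e \in E,\ |f(e)| > \delta\}). \tag{2.23.15}$$

mbe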
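
The inequality can be tested directly on a small example. In the sketch below the distribution (a fair die), the random variable (the deviation from 7/2) and the helper names `moment` and `tail` are illustrative assumptions; the loop checks Eq. (2.23.14) for a few values of k and δ with exact arithmetic.

```python
from fractions import Fraction

def moment(p, f, k):
    """k-th moment mu_k(f) = sum_e |f(e)|^k p(e), Eq. (2.23.13)."""
    return sum(abs(f(e)) ** k * w for e, w in p.items())

def tail(p, f, delta):
    """Probability of the set {e : |f(e)| > delta}."""
    return sum((w for e, w in p.items() if abs(f(e)) > delta), Fraction(0))

# Hypothetical example: fair die, f(e) = e - 7/2 (deviation from the mean).
die = {e: Fraction(1, 6) for e in range(1, 7)}
f = lambda e: e - Fraction(7, 2)

for k in (1, 2, 4):
    for delta in (Fraction(1, 2), Fraction(3, 2), Fraction(5, 2)):
        bound = moment(p=die, f=f, k=k) / delta ** k
        assert tail(die, f, delta) <= bound
```

For small δ the bound exceeds 1 and carries no information; it becomes useful only when δ is large compared with the k-th root of the moment.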
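PROOF OF PROPOSITION 33. By applying the Chebyshev inequality to the
random variable $f_N - \bar f$, one finds

$$p_N(\{e \mid e \in E^N,\ |f_N(e) - \bar f| > \delta\}) \le \frac{\mu_2}{\delta^2}, \tag{2.23.16}$$

where $p_N$ denotes the probability in $(E, p)^N$ and

$$
\begin{aligned}
\mu_2 &= \sum_{e \in E^N} p_N(e) (f_N(e) - \bar f)^2
= \sum_{e_1, \ldots, e_N} \Bigl( \prod_{i=1}^{N} p(e_i) \Bigr) \Bigl( \frac{1}{N} \sum_{j=1}^{N} (f(e_j) - \bar f) \Bigr)^2 \\
&= \frac{1}{N^2} \sum_{j,k=1}^{N} \sum_{e_1, \ldots, e_N} \Bigl( \prod_{i=1}^{N} p(e_i) \Bigr) (f(e_j) - \bar f)(f(e_k) - \bar f)
\end{aligned}
\tag{2.23.17}
$$

and

$$
\begin{aligned}
\sum_{e_1, \ldots, e_N} \Bigl( \prod_{i=1}^{N} p(e_i) \Bigr) (f(e_j) - \bar f)(f(e_k) - \bar f)
&= \sum_{e_j, e_k} p(e_j) p(e_k) (f(e_j) - \bar f)(f(e_k) - \bar f) \\
&= \Bigl( \sum_{e} p(e) (f(e) - \bar f) \Bigr)^2 = 0
\end{aligned}
\tag{2.23.18}
$$

by the definition of $\bar f = \sum_{e \in E} p(e) f(e)$, if j ≠ k.
The last member of Eq. (2.23.17) can be similarly computed yielding

$$\mu_2 = \frac{1}{N^2}\, N \Bigl( \sum_{e} p(e) (f(e) - \bar f)^2 \Bigr) = \frac{\sigma^2}{N}, \tag{2.23.19}$$

where $\sigma^2 = \sum_{e} p(e) (f(e) - \bar f)^2$ and the proposition is proved as $\mu_2 \to 0$ as N → ∞
[see Eq. (2.23.16)]. mbe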
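
The identity $\mu_2 = \sigma^2/N$ of Eq. (2.23.19) can be verified by summing over all of $E^N$ when N is small. The sketch below does so with exact fractions; the biased coin and the function names are assumptions chosen for the example.

```python
from fractions import Fraction
from itertools import product

def second_moment_of_mean(p, f, N):
    """mu_2 = sum over e in E^N of p(e_1)...p(e_N) (f_N(e) - fbar)^2."""
    fbar = sum(w * f(e) for e, w in p.items())
    mu2 = Fraction(0)
    for events in product(p, repeat=N):
        weight = Fraction(1)
        for e in events:
            weight *= p[e]
        f_N = sum(f(e) for e in events) / Fraction(N)
        mu2 += weight * (f_N - fbar) ** 2
    return mu2

def variance(p, f):
    """sigma^2 = sum_e p(e) (f(e) - fbar)^2."""
    fbar = sum(w * f(e) for e, w in p.items())
    return sum(w * (f(e) - fbar) ** 2 for e, w in p.items())

# Hypothetical example: a biased coin with values +1 and -1.
coin = {+1: Fraction(2, 3), -1: Fraction(1, 3)}
f = lambda e: e
for N in (1, 2, 3, 6):
    assert second_moment_of_mean(coin, f, N) == variance(coin, f) / N
```

The cross terms with j ≠ k cancel exactly, as in Eq. (2.23.18), which is why the equality holds without any rounding.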


Observation. Note that Eqs. (2.23.16) and (2.23.19) show more: they imply
that the probability of the event $|f_N(e) - \bar f| > \delta_N$ tends to zero as N → ∞,
provided the sequence $\delta_N$ is such that $N \delta_N^2 \to \infty$ as N → ∞, i.e., provided $\delta_N$ does
not go to zero as fast as, or faster than, $N^{-1/2}$.


Also the problem of the determination of the average value of an observable 
over a sequence of times succeeding each other at time intervals a ± ε, where
the choice of the sign ± is a random choice in the sense informally discussed at
the beginning of this section, can be easily dealt with by the above techniques. 


35 Proposition. Let $f \in C^\infty(\mathbb{R})$ be a periodic function with period T > 0.
Consider the probability distribution $(E^N, p_N)$ on the space $E^N$ of the N-tuples
$e = (\varepsilon_0, \ldots, \varepsilon_{N-1})$, $\varepsilon_i = \pm\varepsilon$, $i = 0, \ldots, N-1$, where

$$p_N(e) = 2^{-N}, \qquad \forall\, e \in E^N. \tag{2.23.20}$$

Given a > ε with a/ε irrational, consider the random variable on $(E^N, p_N)$

$$M_N(e) = \frac{1}{N} \sum_{j=0}^{N-1} f(ja + \varepsilon_0 + \ldots + \varepsilon_{j-1}). \tag{2.23.21}$$

Then

$$\lim_{N \to \infty} M_N(e) = \frac{1}{T} \int_0^T f(\tau)\, d\tau \equiv \bar f \tag{2.23.22}$$

in probability.


Observations. 

(1) The interest of the proposition is that, even if some measurement errors
involving the successive timing of the observations are present, the average value
of f, computed using the data successively obtained, has a large probability
of being close to the “ideal” average value, i.e., to the continuous average,
independently of a and ε, provided a/ε is irrational.

(2) The coincidence of the stochastic average with the continuous average
depends upon the irrationality of a/ε, but not on the value of T: it is therefore
a property of the structure of the measurement (through the parameters a
and ε) and does not depend on the characteristic properties of the observable
f; unlike in the comparison between the two ideal notions of the average
(continuous and discrete with step a) where the rationality of T/a was relevant
(see Proposition 32).

(3) With the same methods of proof, the above proposition could be extended
to the case when $\varepsilon_i$ takes more than two values: $\varepsilon_i = \pm\alpha_1, \ldots, \pm\alpha_k$, and
$p(\alpha_j) = p(-\alpha_j)$. In this case, the condition “ε/a irrational” will be replaced
by the condition “there is at least one value $\bar\alpha$ among the values of $\alpha_j$ such
that $\bar\alpha/a$ is irrational”.

(4) Finally, always with the same technique of proof, one could treat the case
ε/a rational, and this would lead one to conclude that $M_N(e)$ still converges in
probability to a well-defined limit expressible in terms of the Fourier transform
of f [with a result analogous to Eq. (2.22.13) generally involving T as well; see
Problem 17 at the end of this section]. The difference between this new limit
and the continuous average could be measured by “the commensurability of a
with respect to ε” [see observation 6 to Definition 17, p. 111 (for an analogous
comment) and Problem 18 at the end of this section].

When the error takes more than one value, as in Observation 3 above, this
difference depends on the maximum degree of commensurability between a
and the values of the various errors. It is sufficient that among the various
errors there is one with respect to which a is “little” commensurable to imply
that $M_N(e)$ converges in probability to a value very close to the continuous
average of f.

For this reason, it is rare that the stochastic average appreciably deviates from
the continuous average: in concrete situations, there are always several
causes of errors and, correspondingly, $\varepsilon_i$ can take very many different values.
Necessarily, a will not be too commensurable with respect to many of them.


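PROOF. The proof is basically a check, relying on the Chebyshev inequality.
Consider the Fourier representation of f:

$$f(t) = \sum_{n \in \mathbb{Z}} \hat f_n\, e^{\frac{2\pi i}{T} n t}, \tag{2.23.23}$$

and take into account that $\hat f_0 = \bar f = T^{-1} \int_0^T f(\tau)\, d\tau$:

$$
\begin{aligned}
M_N(e) - \bar f &= \frac{1}{N} \sum_{j=0}^{N-1} \bigl( f(ja + \varepsilon_0 + \ldots + \varepsilon_{j-1}) - \bar f \bigr) \\
&= \frac{1}{N} \sum_{j=0}^{N-1} \sum_{n \ne 0} \hat f_n\, e^{\frac{2\pi i}{T} n (ja + \varepsilon_0 + \ldots + \varepsilon_{j-1})} \\
&= \sum_{n \ne 0} \hat f_n\, \frac{1}{N} \sum_{j=0}^{N-1} e^{\frac{2\pi i}{T} n (ja + \varepsilon_0 + \ldots + \varepsilon_{j-1})}.
\end{aligned}
\tag{2.23.24}
$$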

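Hence, to apply the Chebyshev inequality, compute the second moment of
$M_N(e) - \bar f$ using Eq. (2.23.24):

$$
\begin{aligned}
\mu_2(N) &= \sum_{e \in E^N} (M_N(e) - \bar f)^2\, p_N(e) = \frac{1}{2^N} \sum_{e \in E^N} (M_N(e) - \bar f)^2 \\
&= \sum_{n_1, n_2 \ne 0} \hat f_{n_1} \hat f_{n_2} \Bigl\{ \frac{1}{N^2 2^N} \sum_{j_1, j_2 = 0}^{N-1} \sum_{e \in E^N}
e^{\frac{2\pi i}{T} [ (j_1 n_1 + j_2 n_2) a + n_1 (\varepsilon_0 + \ldots + \varepsilon_{j_1 - 1}) + n_2 (\varepsilon_0 + \ldots + \varepsilon_{j_2 - 1}) ]} \Bigr\}.
\end{aligned}
\tag{2.23.25}
$$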


The series over $n_1, n_2$ is term-by-term bounded by the convergent series
$\sum_{n_1, n_2} |\hat f_{n_1}|\, |\hat f_{n_2}|$: in fact, the factor within curly brackets is a sum of $N^2 2^N$
addends each with modulus $1/(N^2 2^N)$ and, therefore, its modulus does not
exceed 1. Hence, the series in Eq. (2.23.25) is uniformly convergent in N and its
limit as N → ∞ can be computed under the summation sign (i.e., term by
term). It will turn out that all the terms in curly brackets in Eq. (2.23.25) tend
to zero as N → ∞; hence, $\mu_2(N) \to 0$ as N → ∞ which, by the Chebyshev inequality,
will imply Proposition 35.

The contribution to the sum inside the curly brackets in Eq. (2.23.25)
coming from the terms with $j_1 = j_2$ involves $N 2^N$ terms with modulus $1/(N^2 2^N)$.
Hence, it tends to zero as N → ∞. Therefore, it will be enough to consider
the terms with $j_1 < j_2$ and to show that their contribution to the sum is also
infinitesimal as N → ∞. The terms with $j_1 > j_2$ can be similarly treated.

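Suppose $j_2 > j_1$: the contribution to the curly bracket term from such
addends is

$$
\frac{1}{N^2} \sum_{j_1=0}^{N-2} \sum_{j_2=j_1+1}^{N-1} e^{\frac{2\pi i}{T} (j_1 n_1 + j_2 n_2) a}\,
\frac{1}{2^N} \sum_{e \in E^N} e^{\frac{2\pi i}{T} [ n_1 (\varepsilon_0 + \ldots + \varepsilon_{j_1 - 1}) + n_2 (\varepsilon_0 + \ldots + \varepsilon_{j_2 - 1}) ]}
\tag{2.23.26}
$$

which, by successively performing the summations over $\varepsilon_{N-1}, \ldots, \varepsilon_0$, becomes

$$
\begin{aligned}
&\frac{1}{N^2} \sum_{j_1=0}^{N-2} \sum_{j_2=j_1+1}^{N-1} e^{\frac{2\pi i}{T} (j_1 n_1 + j_2 n_2) a}
\Bigl( \cos \frac{2\pi}{T} \varepsilon n_2 \Bigr)^{j_2 - j_1} \Bigl( \cos \frac{2\pi}{T} \varepsilon (n_1 + n_2) \Bigr)^{j_1} \\
&\quad = \frac{1}{N^2} \sum_{j_1=0}^{N-2} \sum_{j_2=j_1+1}^{N-1}
\Bigl( e^{\frac{2\pi i}{T} (n_1 + n_2) a} \cos \frac{2\pi}{T} \varepsilon (n_1 + n_2) \Bigr)^{j_1}
\Bigl( e^{\frac{2\pi i}{T} n_2 a} \cos \frac{2\pi}{T} \varepsilon n_2 \Bigr)^{j_2 - j_1}.
\end{aligned}
\tag{2.23.27}
$$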
The summation over $j_2$ can now be performed, noting that if $n_2 \ne 0$,

$$A \equiv e^{\frac{2\pi i}{T} n_2 a} \cos \frac{2\pi}{T} \varepsilon n_2 \ne 1 \tag{2.23.28}$$

because |A| ≤ 1 and if A = 1 the number ε/a would have to be rational,
regardless of T. The result of the sum in Eq. (2.23.27) over $j_2$ is then

$$\frac{1}{N^2} \sum_{j_1=0}^{N-2} \Bigl( e^{\frac{2\pi i}{T} (n_1 + n_2) a} \cos \frac{2\pi}{T} \varepsilon (n_1 + n_2) \Bigr)^{j_1} \frac{A - A^{N - j_1}}{1 - A} \tag{2.23.29}$$

and this sum involves (N − 1) addends each with modulus bounded by
$N^{-2}\, 2/|1 - A|$. Hence, it tends to zero as N → ∞. mbe
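
The role of the factor A can be probed numerically on the simplest observable. The sketch below assumes the test function f(t) = cos(2πt/T), whose continuous average is zero, and assumes that the j-th observation is made at time ja plus the sum of the first j errors; all names and the numerical values of a, ε, T are illustrative choices. The exact expectation of the stochastic average is then a geometric sum in A, and a single random realization is also sampled for comparison.

```python
import cmath, math, random

def factor_A(a, eps, T, n=1):
    """A = exp(2 pi i n a / T) * cos(2 pi n eps / T), cf. Eq. (2.23.28)."""
    return cmath.exp(2j * math.pi * n * a / T) * math.cos(2 * math.pi * n * eps / T)

def expected_average(N, a, eps, T):
    """Exact expectation of M_N for the test observable f(t) = cos(2 pi t / T)."""
    A = factor_A(a, eps, T)
    return sum((A ** j).real for j in range(N)) / N

def sampled_average(N, a, eps, T, rng):
    """One random realization of M_N for f(t) = cos(2 pi t / T)."""
    t, total = 0.0, 0.0
    for _ in range(N):
        total += math.cos(2 * math.pi * t / T)
        t += a + rng.choice((-eps, eps))
    return total / N

# Assumed values; in floating point "a/eps irrational" can only be approximated.
a, eps, T = 1.0, math.sqrt(2) / 10, 2 * math.pi
rng = random.Random(0)
for N in (10, 100, 1000, 10000):
    print(N, expected_average(N, a, eps, T), sampled_average(N, a, eps, T, rng))
```

Since the geometric sum is bounded by 2/|1 − A|, the expectation decays at least like 1/N, which is the same mechanism used in the estimate following Eq. (2.23.29).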


2.23.1 Exercises and Problems 


1. Consider the “fair probability” distribution (E, p) on a set of six events E = {1, 2, ..., 6},
p(i) = 1/6 (“perfect die”). Compute the probability distribution for the following random
variables (see Definition 19):

$$
f_1(i) = \begin{cases} 1 & \text{if } i \text{ is even}, \\ -1 & \text{if } i \text{ is odd}, \end{cases}
\qquad
f_2(i) = \begin{cases} 1 & \text{if } i = 1, 2, 3, \\ -1 & \text{if } i = 4, 5, 6. \end{cases}
$$


2. Let E consist of two elements +1 and −1 and let p(±1) = 1/2. Compute, in $(E, p)^N$, the
moment $\mu_2$ of the random variables $f(e) = (e_1 + \ldots + e_N)$, $e = (e_1, \ldots, e_N) \in E^N$, and
$f(e) = \sin(e_1 + \ldots + e_N)$.
3. Same as Problem 2 with p(+1) = 2,p(-1) E 3.


4. Compute the limit in probability of the random variables (€1+...+¢n)/N, as N — +00, 
in (£, p)”, where £ = {—1, +1}, p(-1) = $, pH) = $- 


5. Consider the stochastic average with step a = V2 with respect to an error distribution 
with the scheme € = {—<,¢}, p(+e) = 3, E= $ for the observable “kinetic energy” on the 


energy 1 motion of the oscillator +a = 0. Estimate the number of measurements N needed 
for finding that the average over N observations deviates from the stochastic average (i.e., 
from the case N = +00) by 10% at most with a probability of 99%. 


6. Same as Problem 5 with error scheme E = {—e,0,¢}, p(te) = 4, p(0) = 4, using the 
observable “potential energy.” 


7. Same as Problem 5 for the motions of the oscillators # + .& + x = (1 — cost)? for the 
observable “work done per unit time by the forcing force. “ 


8. Interpret $\sum_{k=1}^{n} \log k = \log n!$ as an approximation for the integral between 1 and n + 1
of the function ξ → log ξ and, using this interpretation, show that


1 
0 < logn! — n(logn — 1) < 14 tlogn, ie 1< 
n 


9.* Using the “Stirling formula” (see Problem 14):

$$n! = n^n e^{-n} \sqrt{2\pi n}\, \Bigl(1 + O\Bigl(\frac{1}{n}\Bigr)\Bigr),$$

show that the probability that

$$f_N(e) = \frac{e_1 + \ldots + e_N}{\sqrt{N}} \in [a, b]$$

with respect to the probability distribution $(E, p)^N$, where E = {−1, +1}, p(±1) = 1/2,
converges to

$$\int_a^b e^{-\frac{x^2}{2}}\, \frac{dx}{\sqrt{2\pi}}$$

as N → +∞ (“Gauss’ theorem”). (Hint: Recall Eq. (2.23.9) to see that the probability
that $(e_1 + \ldots + e_N)/\sqrt{N}$ takes the value $(N - 2k)/\sqrt{N}$, k = 0, 1, ..., N, is given by $2^{-N} \binom{N}{k}$; then
express the factorials in $\binom{N}{k}$ via the Stirling formula, recalling that k must be such that
$a \le (N - 2k)/\sqrt{N} \le b$, etc.)


10.* Show that the statement in Problem 9 implies that the sequence of random variables
$f_N$ considered there does not converge in probability as N → ∞.


11. Assuming the result in Problem 9, show that the sequence

$$f_N^{(\alpha)}(e) = \frac{e_1 + \ldots + e_N}{N^{\alpha/2}}$$

of random variables with respect to the probability distribution considered in Problem 9
converges to zero in probability if α > 1, does not converge if α = 1 (see Problem 10), and
diverges if α < 1 (in the sense that the probability that $|f_N^{(\alpha)}(e)| < a$ approaches zero, as
N → ∞.)


12. Show that the probability $p_N$ that the random variable $f_N(e)$ introduced in Problem
9 is positive approaches 1/2 as N → +∞ (Hint: Distinguish N even and N odd.)


13. In the context of Problems 9 and 12, estimate how fast $|p_N - \frac{1}{2}| \to 0$ as N → ∞.


14.* Prove Stirling’s formula with a constant Γ instead of 2π, leaving aside the determination
of Γ, refining the argument in Problem 8. (Hint:


ogni =X togi= > f logide = 5 | log(a — (x — i)) dx 
i=2 i=2i—1 i=2 Y i—1 
n 4 ee n 
=f [tog + log(1 a *)| av = f log x dx 
ja2/i-1 T 1 
D i a? iy z ija orl (e-¥) | 
og(1 — — |dx ——_ dx 
j= /i-1 T T jan /i-1 T 


n n 


=n (logn — 1) 4 Dn H 5 (1 ilog —). 


1=2 1=2 


where y; denotes the second integral in the intermediate step. Then || < const i7? and 


1+ilog +; = x } z H... so that 


a1 Š 
logn! = n(logn — 1) 4 pe - 4 De: 


with Ji < const i™?. Since 77, 4 = logn — Č + O(4) with C suitably chosen (see next 
exercise), it follows that 


= ids 1 
logn! = n(logn — 1) + log Vn — C Da | Or) 
i=2 


so if I = exp(C + 5X? Ñi), it follows that n! = n"e~" yn T (1+ O(4)).) 


15. Show that Dt = logn — C + O(4), where C is a suitable constant (“Euler- 
Mascheroni constant”) (Hint: 


and show that |7;| < consti~?. 


16.* Complete the derivation of the Stirling formula begun in Problem 14 by showing 
that [ = V 27. (Hint: Use Problem 9, with I instead of 27, which says that the random 


a2 
variables fye) lie in [—A, A] with a probability converging to TA e 2 dx/VT (if one does 
not suppose I’ = 27 yet). Then, by estimating the factorials in Ceo by using the 
Stirling formula with T instead of 27, see Problem 14, show that 


NNa 
|N—2k|//N>A 


2 
uniformly in N: this implies that rS e` T dx//T = 1; hence, I = yr. The estimate on 
the >> 2N (X) is quite delicate and should be decomposed into two estimates: for instance 


the first for E — 4| E€ LA bl and the second for p — 4| E€ [zo isl) 
17. Show that if a/ε is not assumed to be irrational, Eq. (2.23.22) becomes $\lim_{N \to \infty} M_N(e) = {\sum}^* \hat f_n \equiv \tilde f$,
where $\hat f_n$ are the harmonics of f and ${\sum}^*$ is a sum running over the n’s such
that $\exp\bigl(\frac{2\pi i}{T} n a\bigr) \cos \frac{2\pi}{T} n \varepsilon = 1$.


18. Deduce from Problem 17 that the limit in Eq. (2.23.22) coincides in probability with
the continuous average not only if a/ε is irrational, but also if T is irrational with respect
to either ε or a. Also if ε/a = p/q, with p and q relatively prime integers, and if a is varied
so that p → ∞, $a \to \bar a > 0$, then $\tilde f \to \bar f$.


2.24 Extremal Properties of Conservative Motion: 
Action and Variational Principle 


Since the construction of the entire universe is absolutely perfect and is due to 
a Creator with infinite knowledge, nothing exists in the world which does not 
exhibit some property of maximum or minimum. Therefore, there cannot be 
any doubt whatsoever about the possibility that all the effects are determined 
by their final aims with the help of the maxima method, in the same way in 
which they are also determined by the initial causes. 


Equilibrium positions of a point mass on a line are identified with the 
points where the potential energy is stationary. Thinking of equilibrium as a 
particular form of motion, one can ask whether the other possible motions 
of a point mass, developing under the action of a conservative force with 
potential energy V, can be characterized by similar stationarity properties. 
This analysis will also be useful as a first illustration of the content of the 
above quoted proposition of Euler. A deeper analysis will be the object of 
Chapter 3. 

Consider a point with mass m > 0 moving in the time interval $[t_1, t_2]$ from
the position $\xi_1$ to the position $\xi_2$: such a motion is a $C^\infty$ function $t \to x(t)$, $t \in [t_1, t_2]$,
such that $x(t_1) = \xi_1$, $x(t_2) = \xi_2$.

Let $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ be the set of all $C^\infty$ motions $t \to x(t)$, $t \in [t_1, t_2]$, such
that $x(t_1) = \xi_1$, $x(t_2) = \xi_2$. If $V \in C^\infty(\mathbb{R})$ is a given function bounded from
below, it makes sense to consider the motions of the point taking place under
the influence of the force generated by the potential energy V. Such motions
are a very restricted class in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$, possibly empty.

The question will be whether there is a real-valued function A defined
on $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ which takes a minimum value or, at least, is stationary
on the motions which, under the influence of the force with potential energy
V, go from $\xi_1$ to $\xi_2$ as t goes from $t_1$ to $t_2$.

The meaning of this question has to be clarified by a preliminary discussion
on the meaning of “extremality” of a function defined on a set of motions, i.e.,
on a set of other functions. Attention will focus on special functions defined
on $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$: those having the form

$$A(x) = \int_{t_1}^{t_2} \mathcal{L}(\dot x(t), x(t), t)\, dt \tag{2.24.1}$$

where $\mathcal{L} \in C^\infty(\mathbb{R}^3)$ associates (η, ξ, t) with $\mathcal{L}(\eta, \xi, t)$.

Eq. (2.24.1) associates a real number with every $x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$. This
number is called the “action of the motion x with respect to the Lagrangian
function $\mathcal{L}$”.²⁴ The notion of “stationarity” or “extremality” of A is very
natural in terms of the related notion of “varied motions”.


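22 Definition. Given $x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ and a real function $(t, \varepsilon) \to y(t, \varepsilon)$
in $C^\infty([t_1, t_2] \times (-1, 1))$ such that

$$\text{(i)} \quad y(t, 0) = x(t), \qquad \forall\, t \in [t_1, t_2], \tag{2.24.2}$$

$$\text{(ii)} \quad y(t_1, \varepsilon) = \xi_1, \quad y(t_2, \varepsilon) = \xi_2, \qquad \forall\, \varepsilon \in (-1, 1), \tag{2.24.3}$$

the function y is said to be a “variation of x” inside $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ parameterized
by ε ∈ (−1, 1). The set of all variations will be denoted by $\mathcal{V}_x$.

More generally, if M is a subset of $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ we shall denote by $\mathcal{V}_x(M)$
the set of the variations of x such that, ∀ ε ∈ (−1, 1), the function

$$t \to y_\varepsilon(t) = y(t, \varepsilon), \qquad t \in [t_1, t_2] \tag{2.24.4}$$

is in M.

Observations.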
Observations. 


(1) We can imagine that a varied motion y is a bundle of motions with equal 
initial and final data (see Fig. 2.13). 


Fig. 2.13: Illustration of the variations (dashed curves) of a motion x (solid curve).


(2) Occasionally it will be useful to think of a variation of $x \in M \subset \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$
as a “regular curve” in the space M: for every ε ∈ (−1, 1)
one has a point $y_\varepsilon \in M$ and $y_0 = x$ [see Eq. (2.24.4)].

(3) If F is a function on $M \subset \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ and $y \in \mathcal{V}_x(M)$, it will make
sense to consider the function of ε ∈ (−1, 1): $\varepsilon \to F(y_\varepsilon)$, “value of F along
the curve y through x in the point parameterized by ε”.

24 For the origin of this name, see the remarks on p. 164 and 241.


It is now possible to give a precise definition of stationarity. 


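23 Definition. Let $M \subset \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ and let A be a function on M having
the form of Eq. (2.24.1). We shall say that x ∈ M is a “stationarity point”
for A in M, if for every $y \in \mathcal{V}_x(M)$ the function [see Eq. (2.24.4)]

$$\varepsilon \to A(y_\varepsilon), \qquad \varepsilon \in (-1, 1) \tag{2.24.5}$$

has a stationarity point in ε = 0, i.e.,

$$\frac{d}{d\varepsilon} A(y_\varepsilon) \Big|_{\varepsilon=0} = 0, \qquad \forall\, y \in \mathcal{V}_x(M). \tag{2.24.6}$$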

Observations. 


(1) In other words, x is a stationarity point for A in M if on every regular
curve y through x, the function A, thought of as a function of the parameter
ε parameterizing the curve, has a stationarity point in ε = 0, i.e., “in x”.

(2) In the theory of maxima and minima of functions $F \in C^\infty(\mathbb{R}^d)$, there are
various equivalent definitions of the stationarity points; for instance,

a) $\frac{\partial F}{\partial x_i}(x) = 0$, i = 1, 2, ..., d.

b) On every $C^\infty$ curve $\varepsilon \to y_\varepsilon$, ε ∈ (−1, 1), through $x = y_0$, the function
$\varepsilon \to F(y_\varepsilon)$ has zero derivative with respect to ε in ε = 0.

Definition (b) is the “finite-dimensional” analogue inspiring Definition 23:
intuitively, one can think of $x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ as a vector with infinitely
many components $x_t = x(t)$, $t \in [t_1, t_2]$, not independent, however, since
they are constrained by the condition that $t \to x_t$ is in $C^\infty([t_1, t_2])$ and
$x_{t_1} = \xi_1$, $x_{t_2} = \xi_2$.

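(3) Strictly speaking, one should prove that Eq. (2.24.6) makes sense, i.e.,
that $\varepsilon \to A(y_\varepsilon)$ is differentiable in ε. But this is an immediate consequence
of the differentiation rules for integrals. Actually, it is easy to find explicit
expressions for the derivatives of A. For instance, from Eqs. (2.24.1) and
(2.24.5), it follows that

$$\frac{d}{d\varepsilon} A(y_\varepsilon) = \frac{d}{d\varepsilon} \int_{t_1}^{t_2} \mathcal{L}\Bigl( \frac{\partial y}{\partial t}(t, \varepsilon), y(t, \varepsilon), t \Bigr)\, dt \tag{2.24.7}$$

and, shortening the notations for $\frac{\partial y}{\partial t}(t, \varepsilon)$ to $\frac{\partial y}{\partial t}$, for $\frac{\partial^2 y}{\partial\varepsilon\,\partial t}(t, \varepsilon)$ to $\frac{\partial^2 y}{\partial\varepsilon\,\partial t}$, and
for $y(t, \varepsilon)$ to y, etc., Eq. (2.24.7) can be rewritten:

$$\frac{d}{d\varepsilon} A(y_\varepsilon) = \int_{t_1}^{t_2} dt\, \Bigl\{ \frac{\partial\mathcal{L}}{\partial\eta}\Bigl(\frac{\partial y}{\partial t}, y, t\Bigr) \frac{\partial^2 y}{\partial\varepsilon\,\partial t} + \frac{\partial\mathcal{L}}{\partial\xi}\Bigl(\frac{\partial y}{\partial t}, y, t\Bigr) \frac{\partial y}{\partial\varepsilon} \Bigr\}. \tag{2.24.8}$$

Avoiding explicit indication of the arguments ∂y/∂t, y, t in $\mathcal{L}$ and in its derivatives,
a straightforward computation yields

$$
\begin{aligned}
\frac{d^2}{d\varepsilon^2} A(y_\varepsilon) = \int_{t_1}^{t_2} dt\, \Bigl\{
&\frac{\partial^2\mathcal{L}}{\partial\eta^2} \Bigl(\frac{\partial^2 y}{\partial\varepsilon\,\partial t}\Bigr)^2
+ \frac{\partial^2\mathcal{L}}{\partial\xi\,\partial\eta} \frac{\partial^2 y}{\partial\varepsilon\,\partial t} \frac{\partial y}{\partial\varepsilon}
+ \frac{\partial\mathcal{L}}{\partial\eta} \frac{\partial^3 y}{\partial\varepsilon^2\,\partial t} \\
&+ \frac{\partial^2\mathcal{L}}{\partial\eta\,\partial\xi} \frac{\partial y}{\partial\varepsilon} \frac{\partial^2 y}{\partial\varepsilon\,\partial t}
+ \frac{\partial^2\mathcal{L}}{\partial\xi^2} \Bigl(\frac{\partial y}{\partial\varepsilon}\Bigr)^2
+ \frac{\partial\mathcal{L}}{\partial\xi} \frac{\partial^2 y}{\partial\varepsilon^2} \Bigr\}.
\end{aligned}
\tag{2.24.9}
$$

The higher derivatives could be evaluated with similar procedures; i.e., $\varepsilon \to A(y_\varepsilon)$
is a $C^\infty$ function.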

As in the case of the functions on $\mathbb{R}^d$, it is convenient to distinguish
between stationary points and points of “local” or “relative” minimum.


24 Definition. If $x \in M \subset \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$, we say that x is a “local” minimum for
A defined by Eq. (2.24.1) on M if for all varied motions $y \in \mathcal{V}_x(M)$, the
function $\varepsilon \to A(y_\varepsilon)$ [see Eq. (2.24.5)] has a relative minimum in ε = 0.


Observations. 
(1) A has a local minimum in x on M if on every regular curve y through x
lying on M, it has a local minimum in x.
(2) A necessary condition for A to have a local minimum relative to M in
x ∈ M is that x is a stationarity point for A on M.
(3) A necessary condition in order that a stationarity point for A on M is a
local minimum on M is that

$$\frac{d^2}{d\varepsilon^2} A(y_\varepsilon) \Big|_{\varepsilon=0} \ge 0, \qquad \forall\, y \in \mathcal{V}_x(M) \tag{2.24.10}$$

if x is the point of stationarity.
(4) If x ∈ M is an absolute minimum point for A on M, i.e., if A(x′) ≥
A(x), ∀ x′ ∈ M, then x is also a local minimum point for A on M.
(5) If A has a local minimum in x relative to M it must be that, given
$y \in \mathcal{V}_x(M)$, there is η > 0 such that if ε ∈ [−η, η], then $A(y_\varepsilon) \ge A(x)$: this
value of η may, however, depend on the choice of y.
(6) One could be tempted to define a local minimum by requiring that A(x) ≤
A(x′), ∀ x′ ∈ M and “close enough” to x. But the meaning of “close enough”
would be unclear.


A necessary and sufficient stationarity criterion, which is as “simple” as the 
one usually considered in the case of the stationarity of functions on $\mathbb{R}^d$ and
concerning the vanishing of the gradient (see Observation 2 (a), to Definition 
23), is the following. 
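36 Proposition. The motion $x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ is a stationary point for Eq.
(2.24.1) on all of $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ if and only if

$$\frac{d}{dt} \frac{\partial\mathcal{L}}{\partial\eta}(\dot x(t), x(t), t) = \frac{\partial\mathcal{L}}{\partial\xi}(\dot x(t), x(t), t), \qquad \forall\, t \in [t_1, t_2]. \tag{2.24.11}$$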


Observations. 

(1) In this proposition, it is essential that the set $M \subset \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ on which
stationarity is considered coincides with $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ itself.

(2) Equation (2.24.11) can be thought of as a differential equation for the
function $t \to x(t)$, $t \in [t_1, t_2]$, i.e., as an equation for the determination of the
stationarity points of A on the entire set $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$. When Eq. (2.24.11)
is viewed in this way, it is called the “Euler-Lagrange” equation for A or $\mathcal{L}$. As
it emerges from the proof, it is analogous to the condition of vanishing of the
gradient in the stationarity problem for functions on $\mathbb{R}^d$ (see observation 2 (a) to Definition
23, p.128).

(3) It has to be kept in mind that, in general, Eq. (2.24.11) is not a differential
equation in the sense of Definition 1, p.14: in important cases, however, Eq.
(2.24.11) is equivalent to a differential equation in that sense (see Problems
4-6 at the end of this section).


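PROOF. It reduces to a check. Let $y \in \mathcal{V}_x$ and set

$$z(t) = \frac{\partial y}{\partial\varepsilon}(t, 0), \qquad t \in [t_1, t_2], \tag{2.24.12}$$

$$\dot z(t) = \frac{\partial^2 y}{\partial t\,\partial\varepsilon}(t, 0), \qquad t \in [t_1, t_2], \tag{2.24.13}$$

and note that Eq. (2.24.2) implies, $\forall\, t \in [t_1, t_2]$:

$$y(t, 0) = x(t), \qquad \frac{\partial y}{\partial t}(t, 0) = \dot x(t), \tag{2.24.14}$$

while Eq. (2.24.3) gives

$$z(t_1) = z(t_2) = 0. \tag{2.24.15}$$

Then, with the above notations, Eq. (2.24.8) becomes

$$\frac{d}{d\varepsilon} A(y_\varepsilon) \Big|_{\varepsilon=0} = \int_{t_1}^{t_2} dt\, \Bigl\{ \frac{\partial\mathcal{L}}{\partial\eta}(\dot x(t), x(t), t)\, \dot z(t) + \frac{\partial\mathcal{L}}{\partial\xi}(\dot x(t), x(t), t)\, z(t) \Bigr\}. \tag{2.24.16}$$

As y varies in $\mathcal{V}_x$, the function z defined by Eq. (2.24.12) spans the entire set
$\mathcal{M}_{t_1,t_2}(0, 0)$. In fact, Eq. (2.24.15) shows that $z \in \mathcal{M}_{t_1,t_2}(0, 0)$; furthermore,
given arbitrarily $\bar z \in \mathcal{M}_{t_1,t_2}(0, 0)$ and setting

$$\bar y(t, \varepsilon) = x(t) + \varepsilon\, \bar z(t) \tag{2.24.17}$$

for ε ∈ (−1, 1), $t \in [t_1, t_2]$, one constructs a varied motion $\bar y \in \mathcal{V}_x$, which, via
Eq. (2.24.12), exactly generates $\bar z$.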

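The wide arbitrariness of z in Eq. (2.24.16) can then be used to deduce
conditions on x. For this purpose it is convenient to eliminate $\dot z(t)$ from Eq.
(2.24.16) by integrating the first term by parts; one finds:

$$
\frac{d}{d\varepsilon} A(y_\varepsilon) \Big|_{\varepsilon=0}
= \Bigl[ \frac{\partial\mathcal{L}}{\partial\eta}(\dot x(t), x(t), t)\, z(t) \Bigr]_{t_1}^{t_2}
- \int_{t_1}^{t_2} \Bigl\{ \frac{d}{dt} \Bigl( \frac{\partial\mathcal{L}}{\partial\eta}(\dot x(t), x(t), t) \Bigr) - \frac{\partial\mathcal{L}}{\partial\xi}(\dot x(t), x(t), t) \Bigr\}\, z(t)\, dt
\tag{2.24.18}
$$

which, by Eq. (2.24.15) and by the preceding remark on the arbitrariness of
z, shows that $(dA/d\varepsilon)(y_\varepsilon)|_{\varepsilon=0} = 0$, $\forall\, y \in \mathcal{V}_x$, becomes:

$$0 = \int_{t_1}^{t_2} \Bigl\{ \frac{d}{dt} \Bigl( \frac{\partial\mathcal{L}}{\partial\eta}(\dot x(t), x(t), t) \Bigr) - \frac{\partial\mathcal{L}}{\partial\xi}(\dot x(t), x(t), t) \Bigr\}\, z(t)\, dt, \tag{2.24.19}$$

$\forall\, z \in \mathcal{M}_{t_1,t_2}(0, 0)$. The equivalence between Eqs. (2.24.19) and (2.24.11) is
implied by the principle of vanishing integrals (see Appendix D). mbe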


As a consequence of Proposition 36, it is possible to answer the question
raised at the beginning of this section. In fact, if one defines for
$x \in \mathcal{M}_{t_1,t_2}(\xi_1,\xi_2)$:

$$ A(x) = \int_{t_1}^{t_2} \Big( \frac{1}{2} m\, \dot x(t)^2 - V(x(t)) \Big)\, dt \tag{2.24.20} $$

the following proposition holds.


37 Proposition. The motion $x$ of a point with mass $m > 0$, developing from
$\xi_1$ to $\xi_2$ in the time interval $[t_1,t_2]$ under the influence of a force with potential
energy $V \in C^\infty(\mathbf{R})$, makes the action of Eq. (2.24.20) on $\mathcal{M}_{t_1,t_2}(\xi_1,\xi_2)$
stationary, i.e., it makes stationary the action with Lagrangian density

$$ \mathcal{L}(\eta,\xi,t) = \frac{1}{2} m \eta^2 - V(\xi). \tag{2.24.21} $$


PROOF. In fact, Eq. (2.24.11) becomes

$$ \frac{d}{dt}\big( m\, \dot x(t) \big) = -\frac{\partial V}{\partial \xi}(x(t)), \qquad t \in [t_1,t_2], \tag{2.24.22} $$

which is the equation of motion. □
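
The stationarity stated in Proposition 37 can be checked numerically on a simple case. The sketch below is only an illustration under stated assumptions: the potential is taken to be that of a harmonic oscillator, $V(\xi) = \frac12 m\omega^2\xi^2$, the motion is $x(t) = \sin \omega t$, and the variation is $y_\varepsilon = x + \varepsilon z$ with $z(t_1) = z(t_2) = 0$. The names `simpson`, `action`, `first_variation` and `second_variation`, as well as the numerical values, are choices made here and are not part of the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Assumed sample data: harmonic oscillator with m = omega = 1 on [0, 1].
m, omega = 1.0, 1.0
t1, t2 = 0.0, 1.0

# A solution of m x'' = -m omega^2 x and its velocity.
x = lambda t: math.sin(omega * t)
dx = lambda t: omega * math.cos(omega * t)

# A variation direction z with z(t1) = z(t2) = 0, and its derivative.
z = lambda t: math.sin(math.pi * (t - t1) / (t2 - t1))
dz = lambda t: math.pi / (t2 - t1) * math.cos(math.pi * (t - t1) / (t2 - t1))

def action(eps):
    """Action (2.24.20) of the varied motion x + eps * z."""
    def lagrangian(t):
        velocity = dx(t) + eps * dz(t)
        position = x(t) + eps * z(t)
        return 0.5 * m * velocity ** 2 - 0.5 * m * omega ** 2 * position ** 2
    return simpson(lagrangian, t1, t2)

h = 1e-3
# Central differences in eps: first and second derivative of the action at eps = 0.
first_variation = (action(h) - action(-h)) / (2 * h)
second_variation = (action(h) - 2 * action(0.0) + action(-h)) / h ** 2

print("dA/d(eps) at 0:", first_variation)
print("d2A/d(eps)2 at 0:", second_variation)
```

The first derivative vanishes up to quadrature error, as Eq. (2.24.19) requires, while the second derivative is positive for this short interval, in agreement with the minimality property discussed next.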


Furthermore, the following interesting proposition holds. 
38 Proposition. Let $t \to \bar x(t)$, $t \in \mathbf{R}$, be a motion of a point mass
with $m > 0$ developing under the action of a force with potential energy
$V \in C^\infty(\mathbf{R})$, bounded from below. Given $t_1 \in \mathbf{R}$, there exists $\bar t > t_1$ such
that if $t_2 \in [t_1, \bar t]$ the motion $t \to \bar x(t)$ observed for $t \in [t_1,t_2]$, i.e., as an
element of $\mathcal{M}_{t_1,t_2}(\bar x(t_1), \bar x(t_2))$, not only is a stationarity point for the
action with Lagrangian Eq. (2.24.21), but is also a local minimum for it in
$\mathcal{M}_{t_1,t_2}(\bar x(t_1), \bar x(t_2))$.


Observation. The proposition motivates the name “principle of the least ac- 
tion” occasionally given to the Propositions 37 and 38. 


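PROOF. By the observation 5 to Definition 24, p.129, given $y \in \mathcal{V}_{\bar x}$, we must
find an $\eta_y$ such that $A(y_\varepsilon) \ge A(\bar x)$, $\forall \varepsilon \in [-\eta_y, \eta_y]$.

Given $t_2 > t_1$ and a variation $y$ of $\bar x$ in $\mathcal{M}_{t_1,t_2}(\bar x(t_1), \bar x(t_2))$, define $\eta_y$ so that
$|y_\varepsilon(t) - \bar x(t)| < 1$, $\forall t \in [t_1,t_2]$, $\forall \varepsilon \in [-\eta_y, \eta_y]$. The comparison of $A(y_\varepsilon)$ with $A(\bar x)$ yields

$$ A(y_\varepsilon) - A(\bar x) = \int_{t_1}^{t_2} \Big\{ \frac{m}{2}\Big( \big(\dot{\bar x}(t) + \dot z(t)\big)^2 - \dot{\bar x}(t)^2 \Big) - \Big( V\big(\bar x(t) + z(t)\big) - V\big(\bar x(t)\big) \Big) \Big\}\, dt, \tag{2.24.23} $$

where we set $z(t) = y_\varepsilon(t) - \bar x(t)$, $t \in [t_1,t_2]$. This is a function $z$ which has
the property

$$ z(t_1) = z(t_2) = 0 \tag{2.24.24} $$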
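and it is $\varepsilon$ dependent. To show that Eq. (2.24.23) is $\ge 0$, apply the
Taylor-Lagrange formula (see Appendix B):

$$ V(\xi + \zeta) - V(\xi) = \frac{\partial V}{\partial \xi}(\xi)\, \zeta + \frac{1}{2}\, \zeta^2\, \gamma(\xi, \zeta) \tag{2.24.25} $$

where $\gamma \in C^\infty(\mathbf{R}^2)$ is a suitable function. Then Eq. (2.24.23) becomes

$$ A(y_\varepsilon) - A(\bar x) = \int_{t_1}^{t_2} \Big\{ \Big[ \frac{m}{2}\, \dot z(t)^2 - \frac{1}{2}\, z(t)^2\, \gamma\big(\bar x(t), z(t)\big) \Big] + \Big[ m\, \dot{\bar x}(t)\, \dot z(t) - \frac{\partial V}{\partial \xi}\big(\bar x(t)\big)\, z(t) \Big] \Big\}\, dt. \tag{2.24.26} $$

Integrating the first term in the second set of square brackets by parts and
using the equation of motion for $\bar x$, Eqs. (2.24.22) and (2.24.24), one realizes
that the integral of the term within the second set of square brackets in Eq.
(2.24.26) vanishes. Therefore, if

$$ M = \max_{|\zeta| \le 1,\ t \in [t_1, t_1+1]} \big| \gamma\big(\bar x(t), \zeta\big) \big|, \tag{2.24.27} $$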


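one sees that, if $|\varepsilon| < \eta_y$, Eq. (2.24.26) implies

$$ A(y_\varepsilon) - A(\bar x) \ge \frac{m}{2} \int_{t_1}^{t_2} \dot z(t)^2\, dt - \frac{M}{2} \int_{t_1}^{t_2} z(t)^2\, dt, \tag{2.24.28} $$

if $t_2 - t_1 < 1$, which is a condition that can be implemented by supposing
$t_2 < \bar t$ and, without loss of generality,

$$ \bar t < t_1 + 1. \tag{2.24.29} $$

On the other hand, since $z(t_1) = 0$,

$$ z(t) = \int_{t_1}^{t} \dot z(\tau)\, d\tau, \tag{2.24.30} $$

and applying the Cauchy-Schwarz inequality (see Appendix A), which generally
looks like, $\forall f, g \in C^\infty([t_1,t_2])$,

$$ \Big| \int_{t_1}^{t_2} f(\tau)\, g(\tau)\, d\tau \Big| \le \Big( \int_{t_1}^{t_2} f(\tau)^2\, d\tau \Big)^{1/2} \Big( \int_{t_1}^{t_2} g(\tau)^2\, d\tau \Big)^{1/2}, \tag{2.24.31} $$

one finds, from Eq. (2.24.30),

$$ \int_{t_1}^{t_2} z(t)^2\, dt = \int_{t_1}^{t_2} dt\, \Big| \int_{t_1}^{t} 1 \cdot \dot z(\tau)\, d\tau \Big|^2 \le \int_{t_1}^{t_2} dt\, \Big( \int_{t_1}^{t} 1\, d\tau \Big) \Big( \int_{t_1}^{t} \dot z(\tau)^2\, d\tau \Big) \le \int_{t_1}^{t_2} dt\, (t - t_1) \Big( \int_{t_1}^{t_2} \dot z(\tau)^2\, d\tau \Big) = \frac{(t_2 - t_1)^2}{2} \int_{t_1}^{t_2} \dot z(\tau)^2\, d\tau. \tag{2.24.32} $$

Hence Eqs. (2.24.28) and (2.24.32) mean

$$ A(y_\varepsilon) - A(\bar x) \ge \frac{1}{2} \Big( m - \frac{M}{2} (t_2 - t_1)^2 \Big) \int_{t_1}^{t_2} \dot z(\tau)^2\, d\tau \tag{2.24.33} $$

which implies $A(y_\varepsilon) - A(\bar x) \ge 0$ if $t_2 \in [t_1, \bar t]$ and if $\bar t$ is close enough to $t_1$
(precisely so that $\bar t - t_1 < 1$ and $2m - M(\bar t - t_1)^2 > 0$), $\forall y \in \mathcal{V}_{\bar x}$, $\forall \varepsilon \in [-\eta_y, \eta_y]$. □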
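
The key estimate of the proof, Eq. (2.24.32), bounds the integral of $z^2$ by $(t_2-t_1)^2/2$ times the integral of $\dot z^2$ for functions vanishing at $t_1$. A minimal numerical sketch of this bound follows; the two test functions, the interval and the helper names (`simpson`, `candidates`, `ratios`) are assumptions made here for illustration only.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

t1, t2 = 0.0, 0.7          # assumed sample interval
T = t2 - t1

# Pairs (z, dz/dt) with z(t1) = z(t2) = 0; both are illustrative choices.
candidates = {
    "sine": (lambda t: math.sin(math.pi * (t - t1) / T),
             lambda t: math.pi / T * math.cos(math.pi * (t - t1) / T)),
    "parabola": (lambda t: (t - t1) * (t2 - t),
                 lambda t: (t1 + t2) - 2 * t),
}

ratios = {}
for name, (z, dz) in candidates.items():
    lhs = simpson(lambda t: z(t) ** 2, t1, t2)                  # integral of z^2
    rhs = 0.5 * T ** 2 * simpson(lambda t: dz(t) ** 2, t1, t2)  # right side of (2.24.32)
    ratios[name] = lhs / rhs

for name, r in ratios.items():
    print(name, "lhs / rhs =", r)
```

Both ratios stay below 1, as (2.24.32) demands; the bound is not sharp, which is harmless for the proof because only the existence of a small enough $\bar t$ is needed.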


In the context of Proposition 38, one can wonder about what happens when
the interval $[t_1,t_2]$ is not small: and one realizes that it is always possible to
cut the interval $[t_1,t_2]$ into finitely many small intervals such that the action
is locally minimal on the variations of the restrictions of $\bar x$ to such intervals.

This situation is strongly reminiscent of the properties of the geodesics on
curved surfaces. For instance, on a sphere, a line joining two points along a
great circle (“geodesic of the sphere”) has the property of being the shortest
line among all those joining the two points and lying on the sphere, provided
their distance, measured along the line itself, is small enough. However, if the
two points are not close enough, it is generally no longer true that such a line
is the shortest (“close enough” here means closer than $\pi R$ if $R$ is the radius).
Finally, let us meditate upon the following important comment: we wish
to stress the fact that the stationarity (or minimality) of $A$ is an “intrinsic
property”, i.e., it is independent of the way the motion is described. To make
this precise, let $\xi \to \gamma(\xi)$ be a $C^\infty$ function defining a “nonsingular” change of
variables (i.e., such that $\gamma'(\xi) = \frac{d\gamma}{d\xi}(\xi) \ne 0$). We can then use as a coordinate
for the point $\xi$ the quantity $\gamma(\xi)$.

Let $\Gamma$ be the inverse function to $\gamma$ defined on the open interval $I = \gamma(\mathbf{R})$ =
$\gamma$-image of $\mathbf{R}$. Suppose, for simplicity, $\gamma(\mathbf{R}) = I = \mathbf{R}$.

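A motion in $\mathbf{R}$, $t \to x(t)$, $t \in [t_1,t_2]$, can be described by the function
$t \to s(t) = \gamma(x(t))$, $t \in [t_1,t_2]$. We shall say that such a function describes the
motion $x$ in the system of coordinates on $\mathbf{R}$ associated with the function $\gamma$.
There is a one-to-one correspondence $B$ between motions $x \in \mathcal{M}_{t_1,t_2}(\xi_1,\xi_2)$
and motions $s \in \mathcal{M}_{t_1,t_2}(\gamma(\xi_1), \gamma(\xi_2))$: it is established by the relations

$$ s(t) = \gamma(x(t)), \qquad x(t) = \Gamma(s(t)), \qquad t \in [t_1,t_2]. \tag{2.24.34} $$

The correspondence of Eq. (2.24.34) will be denoted by $s = Bx$. Let $\Gamma'$ be
the derivative of $\Gamma$; then $s = Bx$ implies

$$ \dot x(t) = \Gamma'(s(t))\, \dot s(t), \tag{2.24.35} $$

and it has to be remarked that the Lagrangians

$$ \mathcal{L}(\eta, \xi, t) = \frac{m}{2}\, \eta^2 - V(\xi), \tag{2.24.36} $$

$$ \tilde{\mathcal{L}}(\eta, \xi, t) = \frac{m}{2}\, \Gamma'(\xi)^2\, \eta^2 - V(\Gamma(\xi)), \tag{2.24.37} $$

attribute the same action to the motions $x \in \mathcal{M}_{t_1,t_2}(\xi_1,\xi_2)$ and, respectively,
$s \in \mathcal{M}_{t_1,t_2}(\gamma(\xi_1), \gamma(\xi_2))$; i.e., if $s = Bx$,

$$ A(x) = \int_{t_1}^{t_2} \Big( \frac{m}{2}\, \dot x(t)^2 - V(x(t)) \Big)\, dt = \tilde A(s) = \int_{t_1}^{t_2} \Big( \frac{m}{2}\, \Gamma'(s(t))^2\, \dot s(t)^2 - V(\Gamma(s(t))) \Big)\, dt \tag{2.24.38} $$

which follows from Eqs. (2.24.34) and (2.24.35).
If $y \in \mathcal{V}_x(\mathcal{M})$ it is natural to associate with $y$ the element $By \in \mathcal{V}_s(B\mathcal{M})$:

$$ (By)(t, \varepsilon) = \gamma(y(t, \varepsilon)), \qquad (t, \varepsilon) \in [t_1,t_2] \times (-1,1), \tag{2.24.39} $$

and $B\mathcal{M} \subset \mathcal{M}_{t_1,t_2}(\gamma(\xi_1), \gamma(\xi_2))$ is the image of $\mathcal{M}$ via the map of Eq.
(2.24.39).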


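It is then an immediate consequence of Definitions 23 and 24 that if $A$ is
stationary or locally minimal on $x \in \mathcal{M}_{t_1,t_2}(\xi_1,\xi_2)$ in $\mathcal{M}$, then $\tilde A$ also is
stationary or locally minimal on $s = Bx \in \mathcal{M}_{t_1,t_2}(\gamma(\xi_1), \gamma(\xi_2))$ in $B\mathcal{M}$ and
vice versa. In particular, this means that the equations

$$ \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \eta}(\dot x(t), x(t), t) = \frac{\partial \mathcal{L}}{\partial \xi}(\dot x(t), x(t), t), \qquad \frac{d}{dt} \frac{\partial \tilde{\mathcal{L}}}{\partial \eta}(\dot s(t), s(t), t) = \frac{\partial \tilde{\mathcal{L}}}{\partial \xi}(\dot s(t), s(t), t) \tag{2.24.40} $$

are “equivalent” if $\mathcal{L}$ and $\tilde{\mathcal{L}}$ are given by Eqs. (2.24.36) and (2.24.37).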

As we shall see, this invariance property of the stationarity (or of the 
local minimality) with respect to changes of coordinates is perhaps the most 
interesting aspect of all the considerations of this section. We shall meet some 
of its very remarkable applications in the theory of systems with many degrees 
of freedom. 
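
The equality of the two actions for corresponding motions, $A(x) = \tilde A(s)$ with $s = Bx$, can be illustrated numerically. The following sketch assumes, purely for illustration, the change of variables $\gamma(\xi) = \sinh \xi$ (so that $\Gamma = \operatorname{arcsinh}$), the potential $V(\xi) = \xi^2/2$, and an arbitrary smooth motion which is not required to solve the equations of motion; none of these choices comes from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

m = 1.0
V = lambda xi: 0.5 * xi ** 2                        # assumed sample potential

gamma = math.sinh                                   # assumed change of variables s = gamma(x)
Gamma = math.asinh                                  # its inverse
dGamma = lambda s: 1.0 / math.sqrt(1.0 + s * s)     # derivative of the inverse

t1, t2 = 0.0, 2.0
x = lambda t: 0.3 + t ** 2 / 4                      # arbitrary smooth motion
dx = lambda t: t / 2

s = lambda t: gamma(x(t))                           # the same motion in the new coordinate
ds = lambda t: math.cosh(x(t)) * dx(t)              # chain rule

L = lambda eta, xi: 0.5 * m * eta ** 2 - V(xi)                                 # Eq. (2.24.36)
L_tilde = lambda eta, xi: 0.5 * m * dGamma(xi) ** 2 * eta ** 2 - V(Gamma(xi))  # Eq. (2.24.37)

A = simpson(lambda t: L(dx(t), x(t)), t1, t2)
A_tilde = simpson(lambda t: L_tilde(ds(t), s(t)), t1, t2)

print("A(x)       =", A)
print("A_tilde(s) =", A_tilde)
```

The two numbers agree up to rounding because the integrands coincide pointwise, which is exactly the content of Eqs. (2.24.34) and (2.24.35).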


CONCLUDING REMARKS 


(1) In the analysis of this section we always dealt with conservative systems. 
In fact, it is not possible to give a simple formulation of the stationary action 
principle for dissipative motions without introducing singular Lagrangians 
(see Problems 12-15 at the end of this section). 

(2) The action of a motion $x$ with Lagrangian (2.24.36) can be thought of
as the product of $(t_2 - t_1)$ times the difference between the average value, in
$[t_1,t_2]$, of the kinetic energy and the average value of the potential energy:

$$ \frac{A}{t_2 - t_1} = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \frac{m}{2}\, \dot x(t)^2\, dt - \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} V(x(t))\, dt. \tag{2.24.41} $$

It is for this reason that one can say that the motion developing for $t \in [t_1,t_2]$
between $\xi_1$ and $\xi_2$ under the influence of a force of given potential energy $V$ is
the one that minimizes the difference between the average kinetic energy and
the average potential energy in every short enough time interval in $[t_1,t_2]$.

We leave it to the reader to elaborate his own philosophical considera- 
tions on this beautiful mathematical property. The interested reader could go 
through the history of the variational principles in mechanics and, more gen- 
erally, in physics, to understand how subjective considerations (as we would 
call them today) have influenced the formulation of the variational principles 
themselves and the recognition of their equivalence to the Newtonian equations
of motion; see also the comments on p. 164 and p. 242 and Euler’s
quotation at the beginning of this section.


2.24.1 Exercises and Problems 


1. Compute the action between $t_1 = 0$ and $t_2 = 2\pi/\omega$ of the motions of a harmonic
oscillator with mass $m > 0$ and pulsation $\omega$.

2. Same as Problem 1 with $t_2$ arbitrary ($t_2 \ne 2\pi/\omega$).


3. Compute the action between $t_1 = 0$ and arbitrary $t_2$ of the motions of a point mass with
mass $m > 0$ subject to the force $f = -mg$, $g > 0$.


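4. Let $\mathcal{L} \in C^\infty(\mathbf{R}^2)$ be such that the correspondence $(\eta, \xi) \to \big(\frac{\partial \mathcal{L}}{\partial \eta}(\eta, \xi), \xi\big)$ can be inverted
in class $C^\infty$ as a mapping of $\mathbf{R}^2$ onto $\mathbf{R}^2$ and let $(p, \xi) \to (f(p, \xi), \xi)$ be the inverse map.
Set $H(p, \xi) = p f(p, \xi) - \mathcal{L}(f(p, \xi), \xi) = [p\eta - \mathcal{L}(\eta, \xi)]_{\eta = f(p, \xi)}$, and check that the “Lagrange
equations”

$$ \dot \xi = \eta, \qquad \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \eta}(\eta, \xi) = \frac{\partial \mathcal{L}}{\partial \xi}(\eta, \xi) $$

are equivalent to the “Hamilton equations”

$$ \dot p = -\frac{\partial H}{\partial \xi}(p, \xi), \qquad \dot \xi = \frac{\partial H}{\partial p}(p, \xi). $$

The motion described in terms of $p$ and $\xi$, $t \to (p(t), \xi(t))$, is a solution of this differential
equation and any of its solutions is called a “Hamiltonian motion” and the space $\mathbf{R}^2$,
thought of as the space of the initial data for the above equations, is called a “phase space”.
(Hint: Note that since, by definition of $p$ and $f$, one has $p = \frac{\partial \mathcal{L}}{\partial \eta}(f(p, \xi), \xi)$, it follows that

$$ \frac{\partial H}{\partial p}(p, \xi) = f(p, \xi) + p\, \frac{\partial f}{\partial p}(p, \xi) - \frac{\partial \mathcal{L}}{\partial \eta}(f(p, \xi), \xi)\, \frac{\partial f}{\partial p}(p, \xi) = f(p, \xi) = \eta $$

and

$$ \frac{\partial H}{\partial \xi}(p, \xi) = p\, \frac{\partial f}{\partial \xi}(p, \xi) - \frac{\partial \mathcal{L}}{\partial \eta}(f(p, \xi), \xi)\, \frac{\partial f}{\partial \xi}(p, \xi) - \frac{\partial \mathcal{L}}{\partial \xi}(f(p, \xi), \xi) = -\frac{\partial \mathcal{L}}{\partial \xi}(f(p, \xi), \xi) = -\frac{\partial \mathcal{L}}{\partial \xi}(\eta, \xi) $$

having used the definition of $H$.)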


5. The function $H$ in Problem 4 can be expressed in terms of $\mathcal{L}$ and vice versa as

$$ H(p, \xi) = \max_{\eta}\, \big( p\eta - \mathcal{L}(\eta, \xi) \big), \qquad \mathcal{L}(\eta, \xi) = \max_{p}\, \big( p\eta - H(p, \xi) \big) $$

(“Legendre duality”), if the maximum is attained at a unique point $\bar\eta$ or $\bar p$, respectively, and,
furthermore, if $\bar\eta$, $\bar p$ are the only stationarity points of the functions in brackets as functions
of $\eta$ or $p$, respectively. (Hint: Write the stationarity conditions for $p\eta - \mathcal{L}(\eta, \xi)$ and those
for $p\eta - H(p, \xi)$ with respect to $\eta$ or, respectively, to $p$. Then use the definition of $H$ in
Problem 4.)


6. The “Hamilton equations” $\dot p = -\frac{\partial H}{\partial \xi}(p, \xi)$, $\dot \xi = \frac{\partial H}{\partial p}(p, \xi)$, with Hamiltonian $H \in C^\infty(\mathbf{R}^2)$,
can be obtained by imposing stationarity of

$$ S = \int_{t_1}^{t_2} \big( p(t)\, \dot x(t) - H(p(t), x(t)) \big)\, dt $$

in the space $\mathcal{M}_{t_1,t_2}((\pi_1, \xi_1), (\pi_2, \xi_2))$ of the $C^\infty([t_1,t_2])$ functions $t \to (p(t), x(t)) \in \mathbf{R}^2$
such that $p(t_1) = \pi_1$, $p(t_2) = \pi_2$, $x(t_1) = \xi_1$, $x(t_2) = \xi_2$ (“Hamilton’s principle”). (Hint:
Apply Proposition 36, Eq. (2.24.1), with $\mathcal{L}(\dot p, \dot x, p, x) = p \dot x - H(p, x)$.)


7. In the context of Problem 6, show that the same Hamilton equations can be obtained by
imposing stationarity of $S$ on the larger space $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ of the $C^\infty([t_1,t_2])$ functions
$t \to (p(t), x(t)) \in \mathbf{R}^2$ such that $x(t_1) = \xi_1$, $x(t_2) = \xi_2$. (Hint: Go through the proof of
Proposition 36 using the special form of the Lagrangian $\mathcal{L}(\dot p, \dot x, p, x) = p \dot x - H(p, x)$.)

8. Let $t \to (p(t), x(t)) \in \mathbf{R}^2$ be a motion verifying the Hamilton equations of Problems 4
and 6. Show that the quantity $S$ defined in Problem 6 coincides with $\int_{t_1}^{t_2} \mathcal{L}(\dot x(t), x(t))\, dt$,
i.e., with the action of the same motion (of course, if $\mathcal{L}$ and $H$ are related as in Problem 4).


9. Extend Problems 4 and 7 to the case when $H$ and $\mathcal{L}$ depend explicitly on time.


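10.* Let $H$ be as in Problem 4 and let $S_t(p, x) = (p(t), x(t))$, $t \ge 0$, be the solution of the
Hamilton equations (as in Problem 4), supposed normal, with $(p, x)$ as initial datum at
$t = 0$. Let $A \subset \mathbf{R}^2$ be a (Riemann) measurable region. Show that $\text{area}(S_t A) = \text{area}(A)$,
$\forall t \ge 0$, if $S_t A$ = {set of points of the form $S_t(p, x)$, with $(p, x) \in A$} (“Liouville’s theorem”).
(Hint: In general, let $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$ be an autonomous normal differential equation in $\mathbf{R}^d$. Set
$\mathbf{x} = S_t \mathbf{y}$, for $t \ge 0$. Then

$$ \text{volume}(S_t A) = \int_{S_t A} d\mathbf{x} = \int_{A} \Big| \det \frac{\partial S_t(\mathbf{y})}{\partial \mathbf{y}} \Big|\, d\mathbf{y} $$

where $\partial S_t(\mathbf{y})/\partial \mathbf{y}$ denotes the Jacobian matrix of the coordinate transformation $\mathbf{x} = S_t(\mathbf{y})$.
This formula shows that if $\det\big(\partial S_t(\mathbf{y})/\partial \mathbf{y}\big) > 0$, the modulus symbol is irrelevant and
$t \to \text{volume}(S_t A)$ is a $C^\infty$ function, and

$$ \frac{d}{dt}\, \text{volume}(S_t A) = \int_{A} \frac{d}{dt} \det \frac{\partial S_t(\mathbf{y})}{\partial \mathbf{y}}\, d\mathbf{y}. $$

But (see §2.6) $S_{t+\tau} = S_\tau S_t$; hence, the last expression is equal to:

$$ \int_{A} \frac{d}{d\tau} \det \frac{\partial S_\tau(S_t(\mathbf{y}))}{\partial \mathbf{y}} \Big|_{\tau=0}\, d\mathbf{y} = \int_{A} \det \frac{\partial S_t(\mathbf{y})}{\partial \mathbf{y}} \cdot \frac{d}{d\tau} \det \frac{\partial S_\tau}{\partial \mathbf{x}}\big(S_t(\mathbf{y})\big) \Big|_{\tau=0}\, d\mathbf{y} $$

by the composite function differentiation rule and by the determinant rules.
It is then sufficient to check that, under suitable circumstances, the derivative

$$ \frac{d}{dt} \det \frac{\partial S_t(\mathbf{x})}{\partial \mathbf{x}} \Big|_{t=0} = 0, \qquad \forall \mathbf{x} \in \mathbf{R}^d, $$

to infer the volume conservation under the same circumstances. If $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$, it follows that

$$ S_t \mathbf{x} = \mathbf{x} + t\, \mathbf{f}(\mathbf{x}) + t^2\, \boldsymbol{\psi}(\mathbf{x}, t) $$

by the Taylor-Lagrange formula (see Appendix B), where $\boldsymbol{\psi}$ is a $C^\infty$ function of $\mathbf{x}$ and $t$.
Hence,

$$ \det \frac{\partial S_t(\mathbf{x})}{\partial \mathbf{x}} = \det \Big( \delta_{ij} + t\, \frac{\partial f^{(i)}}{\partial x_j}(\mathbf{x}) + t^2\, \frac{\partial \psi^{(i)}}{\partial x_j}(\mathbf{x}, t) \Big)_{i,j = 1, \dots, d}; $$

hence, by developing the determinant,

$$ \det \frac{\partial S_t(\mathbf{x})}{\partial \mathbf{x}} = 1 + t \sum_{i=1}^{d} \frac{\partial f^{(i)}}{\partial x_i}(\mathbf{x}) + t^2\, \tilde\psi(\mathbf{x}, t), $$

where $\tilde\psi$ is a suitable $C^\infty$ function of $\mathbf{x}, t$. Hence, the derivative of $\det(\partial S_t(\mathbf{x})/\partial \mathbf{x})$ for $t = 0$
is $\sum_{i=1}^{d} \frac{\partial f^{(i)}}{\partial x_i}(\mathbf{x}) = \text{div}\, \mathbf{f}(\mathbf{x})$, wherein the right-hand side is the notation used in physics for
the left-hand side (“divergence of $\mathbf{f}$”).
Therefore, if $\text{div}\, \mathbf{f} = 0$, the flow $S_t$ generated by $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$ preserves the volume (this
also motivates the name “divergence” given to $\text{div}\, \mathbf{f}(\mathbf{x})$ since it measures the rate of increase
of volume under the transformation $S_t$). In fact, it follows from the above considerations
that $\det(\partial S_t(\mathbf{x})/\partial \mathbf{x}) \equiv 1$, being constant and equal to 1 for $t = 0$. Then note that the
Hamilton equations are divergenceless.)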


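11.* Let $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$ be an autonomous normal differential equation in $\mathbf{R}^d$, $\mathbf{f} \in C^\infty(\mathbf{R}^d)$, and
suppose $\text{div}\, \mathbf{f}(\mathbf{x}) = \sum_{i=1}^{d} \frac{\partial f^{(i)}}{\partial x_i}(\mathbf{x}) = 0$, $\forall \mathbf{x} \in \mathbf{R}^d$. So, by the hint to Problem 10, it follows
that the solution flow $(S_t)_{t \in \mathbf{R}_+}$ preserves the volume: $\text{volume}\, S_t A = \text{volume}\, A$.

Suppose that the solution flow maps a bounded open set $\Omega \subset \mathbf{R}^d$ into itself: $S_t \Omega \subset \Omega$, $\forall t \in \mathbf{R}_+$.
Show that given $\mathbf{x}_0 \in \Omega$, $t_0 > 0$, and a neighborhood $U \subset \Omega$ of $\mathbf{x}_0$, there exists $t \ge t_0$
such that $S_t U \cap U \ne \emptyset$; i.e., close to any point $\mathbf{x}_0 \in \Omega$, there is another point which comes as
close to $\mathbf{x}_0$ after a given, arbitrarily large, time (“Poincaré’s recurrence theorem”). (Hint:
Suppose $S_{t_0} U \cap U = \emptyset$, otherwise $t = t_0$; then consider $S_{2t_0} U \cap U$: if it is empty, show that the
three sets $U$, $S_{t_0} U$, $S_{2t_0} U$ must be pairwise disjoint, while if $S_{2t_0} U \cap U \ne \emptyset$ then $t = 2t_0$.
In the first case, consider $S_{3t_0} U$: if $S_{3t_0} U \cap U = \emptyset$ the four sets $U$, $S_{t_0} U$, $S_{2t_0} U$ and $S_{3t_0} U$
must be pairwise disjoint; if not, take $t = 3t_0$, etc. The result could fail only if the sequence
$U, S_{t_0} U, S_{2t_0} U, \dots, S_{kt_0} U, \dots$ is an infinite sequence of pairwise disjoint sets. However, in
such a case, $\text{volume}(\Omega) \ge \sum_{k=0}^{\infty} \text{volume}(S_{kt_0} U) = \sum_{k=0}^{\infty} \text{volume}(U) = +\infty$ because $U$ is
open and $\text{volume}(U) > 0$, which is absurd since $\Omega$ is a bounded set.)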


12. Show that the equation $\ddot x + \gamma \dot x = 0$, $\gamma > 0$, describing a free particle moving under
the action of linear friction, is the Euler-Lagrange equation associated with the Lagrangian
$\mathcal{L}(\dot x, x) = \dot x \log \dot x - \gamma x$ in the region $\dot x > 0$, [27]. (Define the Euler-Lagrange equations by
Eq. (2.24.11), i.e. as $(d/dt)(\partial \mathcal{L}/\partial \dot x) = \partial \mathcal{L}/\partial x$.)


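13. Let $V \in C^\infty(\mathbf{R})$ be bounded below. Show that if $F \in C^\infty(\mathbf{R})$ has a non vanishing
derivative, the equation $\ddot x = -(dV/dx)(x)$ can be described by the Lagrangian function

$$ \mathcal{L}(\eta, \xi) = \eta \int_{1}^{\eta} \frac{F\big(\frac{1}{2} y^2 + V(\xi)\big)}{y^2}\, dy $$

in the region $\dot x > 0$, i.e., $\eta > 0$. What does $\mathcal{L}$ become if $F(e) = e$, $\forall e \in \mathbf{R}$? Is this consistent
with the alternative Lagrangian $\mathcal{L} = \frac{1}{2} \eta^2 - V(\xi)$? (see [27]).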


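14. Consider the damped oscillator $\ddot x + \gamma \dot x + \omega^2 x = 0$ and let $a = (4\omega^2 - \gamma^2)^{-1/2}$, $\gamma > 0$. Show
that in the region $\eta > 0$, $\xi > 0$, the Lagrangian

$$ \mathcal{L}(\eta, \xi) = -\frac{1}{2} \log\big( \omega^2 \xi^2 + \gamma \eta \xi + \eta^2 \big) + a \Big( \frac{2\eta}{\xi} + \gamma \Big) \text{arctg} \Big( a \Big( \frac{2\eta}{\xi} + \gamma \Big) \Big) $$

has, as Euler-Lagrange equations, the damped oscillator equations (see [27]).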


15. Let $\ddot x = g(\dot x, x)$ be a differential equation. Show that in order that a function $\mathcal{L}$ on a
subset $A$ of $\mathbf{R}^2$ generates (via the Euler-Lagrange equations) the equation $\ddot x = g(\dot x, x)$ for
the motions developing in $A$, it must be

$$ \frac{\partial^2 \mathcal{L}}{\partial \eta^2}(\eta, \xi)\, g(\eta, \xi) + \frac{\partial^2 \mathcal{L}}{\partial \eta\, \partial \xi}(\eta, \xi)\, \eta - \frac{\partial \mathcal{L}}{\partial \xi}(\eta, \xi) = 0, \qquad \forall (\eta, \xi) \in A. $$

(Hint: Write the Euler-Lagrange equations substituting $\ddot x$ with $g(\dot x, x)$) (see [27]).²⁵


25 The last four problems are taken from [27]. The equation for $\mathcal{L}$ in Problem 15 allows one
to find many Lagrangians for the same equation. Note, however, that such Lagrangians
will generally be singular somewhere in $\mathbf{R}^2$, always so, probably, if the equation $\ddot x =
g(\dot x, x)$ is nonconservative. So, strictly speaking, this confirms the fact that a Lagrangian
description, in the sense of §2.24, with $\mathcal{L} \in C^\infty(\mathbf{R}^2)$ can only be found for conservative
systems.
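
The area conservation stated in Problem 10 (“Liouville’s theorem”) can be observed numerically. The sketch below is an illustration only: it assumes the sample Hamiltonian $H(p, x) = p^2/2 - \cos x$ (a pendulum, not taken from the text), integrates the Hamilton equations with a classical fourth-order Runge-Kutta scheme, and estimates the Jacobian determinant of the flow $S_t$ by central finite differences. All function names and parameter values are hypothetical choices made here.

```python
import math

def field(p, x):
    """Hamilton equations for the assumed sample H = p^2/2 - cos(x):
    dp/dt = -dH/dx = -sin(x), dx/dt = dH/dp = p."""
    return -math.sin(x), p

def flow(p, x, t, steps=2000):
    """Approximate S_t(p, x) with the classical Runge-Kutta method."""
    h = t / steps
    for _ in range(steps):
        k1 = field(p, x)
        k2 = field(p + 0.5 * h * k1[0], x + 0.5 * h * k1[1])
        k3 = field(p + 0.5 * h * k2[0], x + 0.5 * h * k2[1])
        k4 = field(p + h * k3[0], x + h * k3[1])
        p += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        x += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return p, x

def jacobian_det(p, x, t, d=1e-5):
    """Determinant of the Jacobian matrix of S_t at (p, x), by central differences."""
    p_plus, x_plus = flow(p + d, x, t)
    p_minus, x_minus = flow(p - d, x, t)
    p_up, x_up = flow(p, x + d, t)
    p_down, x_down = flow(p, x - d, t)
    a11 = (p_plus - p_minus) / (2 * d)   # d p(t) / d p(0)
    a21 = (x_plus - x_minus) / (2 * d)   # d x(t) / d p(0)
    a12 = (p_up - p_down) / (2 * d)      # d p(t) / d x(0)
    a22 = (x_up - x_down) / (2 * d)      # d x(t) / d x(0)
    return a11 * a22 - a12 * a21

det = jacobian_det(0.5, 1.0, 3.0)
print("det of the Jacobian of S_t:", det)
```

A determinant equal to 1 at every initial datum is the local form of area conservation; the printed value differs from 1 only by discretization and rounding errors.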


3 


Systems with Many Degrees of Freedom. 
Theory of the constraints. Analytical 
Mechanics 


3.1 Systems of Points 


We begin with some definitions which are perhaps obvious from the consider- 
ations of Chapters 1 and 2, but are nevertheless necessary. 

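A notational convention that will allow important formal simplifications is
that, if $M = m_1 + m_2 + \dots + m_p$ is a sum of $p$ positive integers, the space $\mathbf{R}^M$
will be considered identical with the space $\mathbf{R}^{m_1} \times \mathbf{R}^{m_2} \times \dots \times \mathbf{R}^{m_p}$. A point
$\boldsymbol{\xi} = (\xi_1, \dots, \xi_M) \in \mathbf{R}^M$ will be identified with the $p$-tuple of vectors $(\boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(p)})$,
where $\boldsymbol{\xi}^{(i)} = (\xi_{m_1 + \dots + m_{i-1} + 1}, \dots, \xi_{m_1 + \dots + m_i})$ for $i = 1, 2, \dots, p$.

Very often such a decomposition of $\boldsymbol{\xi}$ into $(\boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(p)})$ will be “natural”
in the context of the discussion. For instance, if a point in $\mathbf{R}^{3N}$ represents
a configuration of a system of $N$ point masses, it will be “natural” to think
of $\boldsymbol{\xi}$ as $(\boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)})$, where $\boldsymbol{\xi}^{(i)} \in \mathbf{R}^3$, $i = 1, \dots, N$, represents the
position in $\mathbf{R}^3$ of the $i$-th point mass. Every time that it will appear useful,
when a natural decomposition of $\boldsymbol{\xi} \in \mathbf{R}^M$ into $(\boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(p)})$, $\boldsymbol{\xi}^{(i)} \in \mathbf{R}^{m_i}$,
$i = 1, \dots, p$, emerges from the context, $\boldsymbol{\xi}$ will be regarded as a $p$-tuple of
vectors in $\mathbf{R}^{m_1} \times \dots \times \mathbf{R}^{m_p}$.

Such an identification will be made without explicit mention, provided no
real ambiguities arise. Thus, an $\mathbf{R}^{3N}$-valued function $t \to \boldsymbol{\varphi}(t)$ defined on $\mathbf{R}$ will
be written, if this is natural within the context, as $t \to (\boldsymbol{\varphi}^{(1)}(t), \dots, \boldsymbol{\varphi}^{(N)}(t))$
with $t \to \boldsymbol{\varphi}^{(i)}(t)$, $i = 1, \dots, N$, an $\mathbf{R}^3$-valued function, etc.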

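If $\mathbf{F}$ is an $\mathbf{R}^d$-valued $C^\infty$-function on $\mathbf{R}^M = \mathbf{R}^{m_1} \times \dots \times \mathbf{R}^{m_p}$, the Jacobian
matrix

$$ \Big( \frac{\partial F_i}{\partial \xi_j}(\boldsymbol{\xi}) \Big)_{i = 1, \dots, d;\ j = 1, \dots, M} $$

will be denoted with the symbol $(\partial \mathbf{F}/\partial \boldsymbol{\xi})(\boldsymbol{\xi})$ or $\frac{\partial \mathbf{F}(\boldsymbol{\xi})}{\partial \boldsymbol{\xi}}$. If $\boldsymbol{\xi} = (\boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(p)}) \in \mathbf{R}^{m_1} \times \dots \times \mathbf{R}^{m_p}$,
the symbol $\partial \mathbf{F}/\partial \boldsymbol{\xi}^{(s)}$ will denote the Jacobian matrix $(\partial F_i/\partial \xi_\ell)(\boldsymbol{\xi})$ where
$i = 1, \dots, d$ and $\ell$ varies in the set of indices corresponding to the coordinates
of $\boldsymbol{\xi}^{(s)}$ (i.e., $\ell = m_1 + \dots + m_{s-1} + 1, \dots, m_1 + \dots + m_s$).
We can now set up the following definition.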


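1 Definition. A “motion” of a system of $N$ point masses in $\mathbf{R}^d$, observed
as the time varies in the interval $I$, is a $C^\infty$ function $t \to \mathbf{x}(t) =
(\mathbf{x}^{(1)}(t), \dots, \mathbf{x}^{(N)}(t))$ defined for $t \in I$ and taking values in $\mathbf{R}^{dN} = \mathbf{R}^d \times \dots \times \mathbf{R}^d$.
A motion $\mathbf{x}$ of a system of $N$ points, with respective masses $m_1, \dots, m_N > 0$,
will be said “governed by a force law $\mathbf{F}$” or “developing under the influence”
of a force law $\mathbf{F}$ if:

(i) $\mathbf{F} = (\mathbf{f}^{(1)}, \dots, \mathbf{f}^{(N)})$ with $\mathbf{f}^{(i)}$ an $\mathbf{R}^d$-valued function in $C^\infty(\mathbf{R}^{2dN+1})$, $\forall i$.
(ii) For $i = 1, \dots, N$, $t \in I$:

$$ m_i\, \ddot{\mathbf{x}}^{(i)}(t) = \mathbf{f}^{(i)}\big( \dot{\mathbf{x}}^{(1)}(t), \dots, \dot{\mathbf{x}}^{(N)}(t), \mathbf{x}^{(1)}(t), \dots, \mathbf{x}^{(N)}(t), t \big) \tag{3.1.1} $$

(iii) Eq. (3.1.1), thought of as a differential equation, is normal for all values
of $m_1, \dots, m_N > 0$ (see Definition 3, §2.5).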


Observation. Requirement (iii) is a restriction of “physical nature” on the 
force laws F that will be considered. Such laws will often be subject to other 
restrictions and, always (beginning with the next section), to the condition of 
verifying the third principle of dynamics (see Chapter 1, §1.3). 


A particularly important role will be played by the “conservative force 
laws”, which deserve a formal definition and the rest of the section. 
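2 Definition. A force law for a system of $N$ points in $\mathbf{R}^d$, i.e., a function
$\mathbf{F} \in C^\infty(\mathbf{R}^{2dN+1})$ with values in $\mathbf{R}^{dN}$, verifying (i) and (iii) of Definition 1, is
called “conservative” if:

(i) it depends solely on the configuration of the system, i.e., there exist $N$
$\mathbf{R}^d$-valued $C^\infty$ functions defined on $\mathbf{R}^{dN}$, $\mathbf{f}^{(1)}, \dots, \mathbf{f}^{(N)}$, such that

$$ \mathbf{f}^{(i)}\big( \boldsymbol{\eta}^{(1)}, \dots, \boldsymbol{\eta}^{(N)}, \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)}, t \big) = \mathbf{f}^{(i)}\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big); \tag{3.1.2} $$

(ii) there is a real-valued function $V \in C^\infty(\mathbf{R}^{dN})$ such that for $i = 1, \dots, N$:

$$ \mathbf{f}^{(i)}\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big) = -\frac{\partial V}{\partial \boldsymbol{\xi}^{(i)}}\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big) \tag{3.1.3} $$

which will be called the “potential energy” of the force law $\mathbf{F}$.


The interest of this definition lies in the fact that the majority of force
models are described by conservative force laws, i.e., by force laws that can
be expressed as in Eq. (3.1.3) which, according to the conventions set up at
the beginning of this section, means:

$$ f^{(i)}_j\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big) = -\frac{\partial V}{\partial \xi^{(i)}_j}\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big), \qquad j = 1, \dots, d. \tag{3.1.4} $$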


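Furthermore, the energy conservation theorem can be easily extended to
systems of $N$ points subject to conservative forces. Given a motion $\mathbf{x}$ of a system
of $N$ points, with respective masses $m_1, \dots, m_N > 0$, define the “kinetic energy”
at time $t$ as the quantity

$$ T(t) \overset{\text{def}}{=} \frac{1}{2} \sum_{i=1}^{N} m_i\, \dot{\mathbf{x}}^{(i)}(t)^2, \tag{3.1.5} $$

while the “potential energy” at time $t$ of the force $\mathbf{F}$ governing the motion,
supposed conservative with potential energy function $V \in C^\infty(\mathbf{R}^{dN})$, will be
defined as

$$ V(t) \overset{\text{def}}{=} V\big( \mathbf{x}^{(1)}(t), \dots, \mathbf{x}^{(N)}(t) \big). \tag{3.1.6} $$

One then notes that

$$ \frac{d}{dt} T(t) = \sum_{i=1}^{N} m_i\, \dot{\mathbf{x}}^{(i)}(t) \cdot \ddot{\mathbf{x}}^{(i)}(t), \tag{3.1.7} $$

$$ \frac{d}{dt} V(t) = \sum_{i=1}^{N} \frac{\partial V}{\partial \boldsymbol{\xi}^{(i)}}\big( \mathbf{x}^{(1)}(t), \dots, \mathbf{x}^{(N)}(t) \big) \cdot \dot{\mathbf{x}}^{(i)}(t), \tag{3.1.8} $$

hence, by Eqs. (3.1.1), (3.1.2), and (3.1.3):

$$ \frac{d}{dt}\big( T(t) + V(t) \big) = \sum_{i=1}^{N} \dot{\mathbf{x}}^{(i)}(t) \cdot \Big( m_i\, \ddot{\mathbf{x}}^{(i)}(t) + \frac{\partial V}{\partial \boldsymbol{\xi}^{(i)}}\big( \mathbf{x}(t) \big) \Big) = 0. \tag{3.1.9} $$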

Therefore the following proposition holds. 


1 Proposition. If $t \to \mathbf{x}(t) = (\mathbf{x}^{(1)}(t), \dots, \mathbf{x}^{(N)}(t))$, $t \in I$, is the motion of a
system of $N$ points, governed by a conservative force law with potential energy
$V$, there is a constant $E$, “total energy” of the motion, equal at all times to
the sum of the kinetic energy and the potential energy:

$$ T(t) + V(t) = E, \qquad \forall t \in I, \tag{3.1.10} $$

with $T(t)$ and $V(t)$ defined in Eqs. (3.1.5) and (3.1.6).


Observation. It is worth stressing that here we are meeting a first but very
important difference between one-dimensional and multi-dimensional motions:
in the case of the motion of a single point in one dimension, every purely
positional force law is conservative. If $d > 1$ or $N > 1$, there are purely
positional force laws which are not conservative in the above sense. For instance,
if $N = 1$, $d = 2$, the force law $f_1(\xi_1, \xi_2) = 0$, $f_2(\xi_1, \xi_2) = \xi_1$ is not conservative
since $\partial f_1/\partial \xi_2 \ne \partial f_2/\partial \xi_1$, while the two derivatives should coincide if $\mathbf{f}$ were
conservative (since they would be the mixed second-order derivatives of the
same function $V$).
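
A quick way to see that the force law of the observation admits no potential energy is to compute its line integral along two different paths with the same endpoints: for a conservative force the two values would coincide. The sketch below uses a midpoint-rule line integral along polygonal paths; the paths and the helper names (`force`, `work_along`) are choices made here for illustration.

```python
def force(xi1, xi2):
    """The force law of the observation: f1 = 0, f2 = xi1."""
    return 0.0, xi1

def work_along(path, n=1000):
    """Midpoint-rule approximation of the line integral of f . d(xi)
    along the polygonal path through the given vertices."""
    total = 0.0
    for (a1, a2), (b1, b2) in zip(path, path[1:]):
        for k in range(n):
            s = (k + 0.5) / n
            f1, f2 = force(a1 + s * (b1 - a1), a2 + s * (b2 - a2))
            total += (f1 * (b1 - a1) + f2 * (b2 - a2)) / n
    return total

# Two paths from (0, 0) to (1, 1).
work_right_then_up = work_along([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
work_up_then_right = work_along([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)])

# The closed loop obtained by following the unit square counterclockwise.
loop_work = work_along([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)])

print(work_right_then_up, work_up_then_right, loop_work)
```

The two open paths give different values, and the closed loop gives a nonzero result equal to the area enclosed, which is the integral of $\partial f_2/\partial \xi_1 - \partial f_1/\partial \xi_2 = 1$ over the square.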


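1. Let $\mathbf{f}$ be a $C^\infty(\mathbf{R}^d/\{0\})$ function with values in $\mathbf{R}^d$ having the form $\mathbf{f}(\mathbf{x}) = \varphi(|\mathbf{x}|)\, \frac{\mathbf{x}}{|\mathbf{x}|}$,
$\varphi \in C^\infty(\mathbf{R}_+/\{0\})$. Consider the force law for a system of $N$ point masses given by

$$ \mathbf{f}^{(i)}\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big) = \sum_{j \ne i} \varphi\big( |\boldsymbol{\xi}^{(i)} - \boldsymbol{\xi}^{(j)}| \big)\, \frac{\boldsymbol{\xi}^{(i)} - \boldsymbol{\xi}^{(j)}}{|\boldsymbol{\xi}^{(i)} - \boldsymbol{\xi}^{(j)}|} = \sum_{j \ne i} \mathbf{f}\big( \boldsymbol{\xi}^{(i)} - \boldsymbol{\xi}^{(j)} \big). $$

This force law is defined for configurations such that $\boldsymbol{\xi}^{(i)} \ne \boldsymbol{\xi}^{(j)}$, $\forall i \ne j$, and strictly speaking
is, therefore, a generalization of the force law notion of Definitions 1 and 2 (requiring the
force to be defined for every configuration $(\boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)})$). It will be called conservative
if there is a function $V$, of class $C^\infty$ on the configurations with $\boldsymbol{\xi}^{(i)} \ne \boldsymbol{\xi}^{(j)}$, such that Eq.
(3.1.4) holds. In this extended sense, show that the above force law is conservative and

$$ V\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big) = -\sum_{i < j} \Phi\big( |\boldsymbol{\xi}^{(i)} - \boldsymbol{\xi}^{(j)}| \big), $$

where $r \to \Phi(r)$, $r > 0$, is a primitive function to $\varphi$: $\Phi(r) = \int^{r} \varphi(r')\, dr'$.
Find sufficient conditions on $\varphi$ so that the above force law can be extended by continuity
to all configurations becoming a conservative force law in the sense of Definition 2.


2. Let $\Phi_{jj'}(r)$, $j, j' = 1, \dots, N$, $j < j'$, be $N(N-1)/2$ functions in $C^\infty((0, +\infty))$. Consider the
force law with potential energy function

$$ V\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big) = \sum_{j < j'} \Phi_{jj'}\big( |\boldsymbol{\xi}^{(j)} - \boldsymbol{\xi}^{(j')}| \big). $$

Find sufficient conditions on $\Phi$ so that the force law is of class $C^\infty(\mathbf{R}^{3N})$.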


3.2 Work. Linear and Angular Momentum 


One can wonder whether it is possible to extend the energy conservation 
theorem so that it could be applied to systems subject to nonconservative 
force laws. The answer is, in some sense, affirmative and it is known as the 
“alive forces theorem” . To formulate this simple theorem, one needs the notion 
of “work of a force” on a given motion. 


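3 Definition. (i) An $\mathbf{R}^{dN}$-valued $C^\infty(\mathbf{R}^{2dN+1})$ function $\mathbf{F}$ verifying properties
(i) and (iii) of Definition 1 will be called a “force law” for a system of $N$ point
masses.

(ii) If $\mathbf{x}$ is a motion, defined for $t \in I$, of a system of $N$ point masses and if $\boldsymbol{\Phi}$
is a force law for it, not necessarily coinciding with the force law generating the
motion $\mathbf{x}$ [i.e. not necessarily verifying Eq. (3.1.1)], one defines the “work”
of the force $\boldsymbol{\Phi}$ in the time interval $[t_1, t_2] \subset I$ as the quantity

$$ L_{t_1,t_2}(\boldsymbol{\Phi}, \mathbf{x}) \overset{\text{def}}{=} \sum_{i=1}^{N} \int_{t_1}^{t_2} \boldsymbol{\varphi}^{(i)}\big( \dot{\mathbf{x}}(t), \mathbf{x}(t), t \big) \cdot \dot{\mathbf{x}}^{(i)}(t)\, dt \tag{3.2.1} $$

where, following the conventions of §3.1, we set $\boldsymbol{\Phi} = (\boldsymbol{\varphi}^{(1)}, \dots, \boldsymbol{\varphi}^{(N)})$.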
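Observations.

(1) Let $\boldsymbol{\Phi}$ be a purely positional force law, i.e., $\forall (\boldsymbol{\eta}, \boldsymbol{\xi}, t) \in \mathbf{R}^{2dN+1}$:

$$ \boldsymbol{\varphi}^{(i)}\big( \boldsymbol{\eta}^{(1)}, \dots, \boldsymbol{\eta}^{(N)}, \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)}, t \big) = \boldsymbol{\varphi}^{(i)}\big( \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)} \big), \tag{3.2.2} $$

$i = 1, \dots, N$. Then

$$ L_{t_1,t_2}(\boldsymbol{\Phi}, \mathbf{x}) = \sum_{i=1}^{N} \int_{t_1}^{t_2} \boldsymbol{\varphi}^{(i)}\big( \mathbf{x}(t) \big) \cdot \dot{\mathbf{x}}^{(i)}(t)\, dt \tag{3.2.3} $$

and one recognizes in the above integral a line integral of the differential form

$$ \sum_{i=1}^{N} \boldsymbol{\varphi}^{(i)}(\boldsymbol{\xi}) \cdot d\boldsymbol{\xi}^{(i)} \tag{3.2.4} $$

on the curve $Z(\mathbf{x})$ described in $\mathbf{R}^{dN}$ by the point $\mathbf{x}(t)$ as $t$ varies in $[t_1, t_2]$
(“trajectory of $\mathbf{x}$”). Formula (3.2.4) is usually read by saying that the work
done by a force on a point which undergoes a displacement is the “scalar
product of the force times the displacement”.

(2) From observation (1), it follows that the work done by a purely positional
force law $\boldsymbol{\Phi}$ in a given time interval during which the system is displaced from
the configuration $\mathbf{x}(t_1)$ to $\mathbf{x}(t_2)$ along a certain trajectory $Z$ solely depends
upon the trajectory and does not depend on the time law governing the motion
along $Z$.

(3) If $\boldsymbol{\Phi}$ is a conservative force with potential energy $V$ [see Eq. (3.1.3)], the
differential form of Eq. (3.2.4) coincides with the differential of $-V$:

$$ \sum_{i=1}^{N} \boldsymbol{\varphi}^{(i)}(\boldsymbol{\xi}) \cdot d\boldsymbol{\xi}^{(i)} = -\sum_{i=1}^{N} \frac{\partial V}{\partial \boldsymbol{\xi}^{(i)}}(\boldsymbol{\xi}) \cdot d\boldsymbol{\xi}^{(i)} = -dV; \tag{3.2.5} $$

hence, from Eq. (3.2.3), it follows that

$$ L_{t_1,t_2}(\boldsymbol{\Phi}, \mathbf{x}) = -V(\mathbf{x}(t_2)) + V(\mathbf{x}(t_1)), \tag{3.2.6} $$

showing that the work performed in a given time interval by a conservative
force on a motion $\mathbf{x}$ depends solely on the initial and final configurations of the
motion, i.e., it is also independent of the trajectory followed by the motion.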


The “theorem of the alive forces” can now be formulated. 


2 Proposition. Let $t \to \mathbf{x}(t)$, $t \in I$, be a motion of a system of $N$ points,
with masses $m_1, \dots, m_N > 0$, developing in $\mathbf{R}^d$ under the action of a force
law $\mathbf{F}$.
Then the variation of the kinetic energy, or “alive force”,¹ between the times
$t_1, t_2 \in I$, is equal to the work performed in $[t_1, t_2]$ by $\mathbf{F}$ on the motion $\mathbf{x}$:

$$ T(t_2) - T(t_1) = L_{t_1,t_2}(\mathbf{F}, \mathbf{x}). \tag{3.2.7} $$


Observation. By Eq. (3.2.6), Eq. (3.2.7) becomes the already discussed energy 
conservation theorem, Proposition 1, whenever F is conservative. 


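PROOF. By Definition 1, p.142, of motion developing under the action of a
force $\mathbf{F} = (\mathbf{f}^{(1)}, \dots, \mathbf{f}^{(N)})$, we have

$$ m_j\, \ddot{\mathbf{x}}^{(j)} = \mathbf{f}^{(j)}\big( \dot{\mathbf{x}}^{(1)}, \dots, \dot{\mathbf{x}}^{(N)}, \mathbf{x}^{(1)}, \dots, \mathbf{x}^{(N)}, t \big), \tag{3.2.8} $$

$\forall j = 1, \dots, N$. Multiplying both sides scalarly by $\dot{\mathbf{x}}^{(j)}$ and summing over $j$:

$$ \sum_{j=1}^{N} m_j\, \ddot{\mathbf{x}}^{(j)} \cdot \dot{\mathbf{x}}^{(j)} = \sum_{j=1}^{N} \mathbf{f}^{(j)} \cdot \dot{\mathbf{x}}^{(j)}, \tag{3.2.9} $$

and integrating both sides with respect to $t$ between $t_1$ and $t_2$ one finds Eq.
(3.2.7). □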
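
Proposition 2 also covers nonconservative forces, and this can be checked numerically. The sketch below is an illustration under assumptions made here: a single point on a line subject to the force $f = -kx - c\dot x$ (elastic force plus linear friction), with hypothetical constants `M_POINT`, `K_SPRING`, `C_FRICTION`. The work is accumulated as an extra variable $W$ with $\dot W = f\,\dot x$, integrated together with the motion by a classical Runge-Kutta scheme.

```python
M_POINT, K_SPRING, C_FRICTION = 1.0, 4.0, 0.3   # assumed sample constants

def rhs(state):
    """Right-hand side for (x, v, W): motion under f = -k x - c v, and dW/dt = f v."""
    x, v, w = state
    f = -K_SPRING * x - C_FRICTION * v
    return (v, f / M_POINT, f * v)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, d, c):
        return tuple(si + c * di for si, di in zip(s, d))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

state = (1.0, 0.0, 0.0)                      # x(t1) = 1, v(t1) = 0, W(t1) = 0
kinetic_start = 0.5 * M_POINT * state[1] ** 2

h, steps = 1e-3, 5000                        # integrate over [t1, t1 + 5]
for _ in range(steps):
    state = rk4_step(state, h)

kinetic_end = 0.5 * M_POINT * state[1] ** 2
work = state[2]

print("T(t2) - T(t1) =", kinetic_end - kinetic_start)
print("work          =", work)
```

The two printed numbers agree up to the integration error, as Eq. (3.2.7) states, even though the total energy of this system is not conserved.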


The interest of Proposition 2 lies in its generality as a consequence of Eq. 
(3.1.1). There are other immediate consequences of Eq. (3.1.1) valid under 
the additional assumption that the force law F governing the motion verifies 
the third principle of dynamics: they are the so called “cardinal equations” of 
dynamics, whose interest is also due to their great generality. 

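As discussed in Chapter 1, the hypothesis that a force law $\mathbf{F}$ for a system
of $N$ point masses verifies the third law of dynamics means several things
mathematically. First, if $\mathbf{F} = (\mathbf{f}^{(1)}, \dots, \mathbf{f}^{(N)})$, the function $\mathbf{f}^{(j)}$, $j = 1, \dots, N$,
can be represented as

$$ \mathbf{f}^{(j)}\big( \boldsymbol{\eta}^{(1)}, \dots, \boldsymbol{\eta}^{(N)}, \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)}, t \big) = \mathbf{f}^{(j)(e)}\big( \boldsymbol{\eta}^{(j)}, \boldsymbol{\xi}^{(j)}, t \big) + \sum_{i \ne j} \mathbf{f}^{(i \to j)}\big( \boldsymbol{\eta}^{(i)}, \boldsymbol{\eta}^{(j)}, \boldsymbol{\xi}^{(i)}, \boldsymbol{\xi}^{(j)}, t \big) \tag{3.2.10} $$

where $\mathbf{f}^{(j)(e)} \in C^\infty(\mathbf{R}^{2d+1})$, $\mathbf{f}^{(i \to j)} \in C^\infty(\mathbf{R}^{4d+1})$ are suitable $\mathbf{R}^d$-valued functions,
$\forall i, j = 1, \dots, N$.
For reasons discussed in Chapter 1, the function $\mathbf{f}^{(j)(e)}$ is called the “external
force” acting upon the $j$-th point mass and $\mathbf{f}^{(i \to j)}$ is called the “force exerted
by the $i$-th point on the $j$-th one”. Second, one assumes that

$$ \mathbf{f}^{(i \to j)}\big( \boldsymbol{\eta}, \boldsymbol{\eta}', \boldsymbol{\xi}, \boldsymbol{\xi}', t \big) = -\mathbf{f}^{(j \to i)}\big( \boldsymbol{\eta}', \boldsymbol{\eta}, \boldsymbol{\xi}', \boldsymbol{\xi}, t \big) \tag{3.2.11} $$

and, finally,

$$ \mathbf{f}^{(i \to j)}\big( \boldsymbol{\eta}, \boldsymbol{\eta}', \boldsymbol{\xi}, \boldsymbol{\xi}', t \big) \text{ is parallel to } \boldsymbol{\xi}' - \boldsymbol{\xi}. \tag{3.2.12} $$


1 In the ancient times the alive force was actually defined to be twice the kinetic energy.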


Equations (3.2.10)-(3.2.12) are the analytic form taken, in our notations, 
by the third principle of dynamics for the force law F acting on the system of 
point masses under consideration (see, also, Chapter 1). 


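4 Definition. A force law for a system of $N$ point masses in $\mathbf{R}^d$ verifies
the third principle of dynamics if it admits a representation like Eq. (3.2.10)
verifying Eqs. (3.2.11) and (3.2.12). In this case, the quantity

$$ \mathbf{R}^{(e)}\big( \boldsymbol{\eta}^{(1)}, \dots, \boldsymbol{\eta}^{(N)}, \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)}, t \big) = \sum_{i=1}^{N} \mathbf{f}^{(i)(e)}\big( \boldsymbol{\eta}^{(i)}, \boldsymbol{\xi}^{(i)}, t \big) \tag{3.2.13} $$

thought of as an $\mathbf{R}^d$-valued $C^\infty(\mathbf{R}^{2dN+1})$ function takes the name of “total
external force” of the force law $\mathbf{F}$. If $d = 3$, the quantity

$$ \mathbf{M}^{(e)}_{\mathbf{a}}\big( \boldsymbol{\eta}^{(1)}, \dots, \boldsymbol{\eta}^{(N)}, \boldsymbol{\xi}^{(1)}, \dots, \boldsymbol{\xi}^{(N)}, t \big) = \sum_{j=1}^{N} \big( \boldsymbol{\xi}^{(j)} - \mathbf{a} \big) \wedge \mathbf{f}^{(j)(e)}\big( \boldsymbol{\eta}^{(j)}, \boldsymbol{\xi}^{(j)}, t \big) \tag{3.2.14} $$

is called the “total momentum of the external forces” of $\mathbf{F}$ with respect to the
point $\mathbf{a} \in \mathbf{R}^3$.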


Observation. If $d \ne 3$, it is still possible to define the momentum of the forces
with respect to a point: however, it cannot be naturally thought of as a vector
in $\mathbf{R}^d$. To avoid complications, rather than on the shaky grounds that the
“physical case” is $d = 3$, we do not deal with this question.


The following proposition gives the so called “cardinal equations of dy- 
namics” : 


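3 Proposition. Given a motion $t \to \mathbf{x}(t)$, $t \in \mathbf{R}_+$, of $N$ points in $\mathbf{R}^3$, with
masses $m_1, \dots, m_N > 0$, define the “linear momentum” at time $t$ and the
“angular momentum”, with respect to $\mathbf{a} \in \mathbf{R}^3$, at time $t$ as the quantities

$$ \mathbf{Q}(t) = \sum_{j=1}^{N} m_j\, \dot{\mathbf{x}}^{(j)}(t), \qquad \mathbf{K}_{\mathbf{a}}(t) = \sum_{j=1}^{N} m_j\, \big( \mathbf{x}^{(j)}(t) - \mathbf{a} \big) \wedge \dot{\mathbf{x}}^{(j)}(t). \tag{3.2.15} $$

If the motion develops under the action of a force law $\mathbf{F}$ verifying the
third principle of dynamics and if one shortens $\mathbf{R}^{(e)}(\dot{\mathbf{x}}^{(1)}(t), \dots, \mathbf{x}^{(N)}(t), t)$
as $\mathbf{R}^{(e)}(t)$ and, likewise, $\mathbf{M}^{(e)}_{\mathbf{a}}(\dot{\mathbf{x}}^{(1)}(t), \dots, \mathbf{x}^{(N)}(t), t)$ as $\mathbf{M}^{(e)}_{\mathbf{a}}(t)$ [see Eqs.
(3.2.13), (3.2.14)], then

$$ \frac{d}{dt} \mathbf{Q}(t) = \mathbf{R}^{(e)}(t), \qquad \frac{d}{dt} \mathbf{K}_{\mathbf{a}}(t) = \mathbf{M}^{(e)}_{\mathbf{a}}(t). \tag{3.2.16} $$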
Observations.

(1) Sometimes the linear momentum is called the “quantity of motion”, while
the angular momentum is called “momentum of the quantity of motion”. The
cardinal equations (3.2.16) show that their time variation depends only upon
the external forces acting on the system.

(2) The first cardinal equation in (3.2.16) is often called the “baricenter theorem”
or the “center of mass theorem”. To understand the origin of this name
associate with the motion in $\mathbf{R}^{3N}$, $t \to \mathbf{x}(t) = (\mathbf{x}^{(1)}(t), \dots, \mathbf{x}^{(N)}(t))$, $t \in I$,
the motion in $\mathbf{R}^3$, $t \to \mathbf{x}_G(t)$, where

$$ \mathbf{x}_G(t) = \frac{\sum_{i=1}^{N} m_i\, \mathbf{x}^{(i)}(t)}{\sum_{i=1}^{N} m_i}. \tag{3.2.17} $$

The point $\mathbf{x}_G = \sum_{i=1}^{N} m_i\, \mathbf{x}^{(i)} \big/ \sum_{i=1}^{N} m_i$ is called the “baricenter” and the motion
$t \to \mathbf{x}_G(t)$, $t \in I$, the “baricenter motion”. Setting $M = \sum_{i=1}^{N} m_i$ (“total mass of
the system”), the first relations in Eqs. (3.2.15) and (3.2.16) become, respectively,

$$ \mathbf{Q}(t) = M\, \dot{\mathbf{x}}_G(t), \tag{3.2.18} $$

$$ M\, \ddot{\mathbf{x}}_G(t) = \mathbf{R}^{(e)}(t), \tag{3.2.19} $$

and Eq. (3.2.19) can be read as “the baricenter of a system of $N$ masses moves
as if it were a single point mass subject to the action of a force equal to the
total external force acting on the system”.

If the external force has the form $\mathbf{f}^{(i)(e)} = m_i\, \mathbf{g}$, $\mathbf{g} \in \mathbf{R}^3$, “gravity force,” the
point $G$ has many other nice properties which motivate its name: they are
discussed below.

(3) Note that, in general, Eq. (3.2.19) is not a “closed equation”: the right- 
hand side cannot, in fact, be computed without already knowing the locations 
and the speeds of all the particles of the system. Nevertheless, there are some 
exceptional particular cases of special importance. For instance, if the external 
force acting on the j-th point is independent of its position and velocity: this 
is the case of the gravity force. 

(4) It is worth stressing that, in general, it is not true that the momentum of
the external forces can be computed by imagining the total force as applied
to the baricenter, i.e., as $(x_G - a) \wedge R^{(e)}$. Neither is it generally true that the
derivative of the angular momentum of the baricenter, i.e., of $M (x_G - a) \wedge \dot x_G$,
is the momentum of the total external forces.

(5) However, in the special case 


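$$f^{(i)}_e = m_i g, \qquad (3.2.20)$$

where g is a fixed vector (“gravity force”), one finds

$$R^{(e)} = \Big(\sum_{i=1}^{N} m_i\Big) g = M g \qquad (3.2.21)$$

and by Eq. (3.2.17),

$$M_a^{(e)} = \sum_{j=1}^{N} (x^{(j)} - a) \wedge m_j g = \Big(\sum_{j=1}^{N} m_j (x^{(j)} - a)\Big) \wedge g = M (x_G - a) \wedge g = (x_G - a) \wedge R^{(e)}. \qquad (3.2.22)$$

Furthermore,

$$\frac{d}{dt}\big[(x_G - a) \wedge M \dot x_G\big] = (x_G - a) \wedge M \ddot x_G + \dot x_G \wedge M \dot x_G = (x_G - a) \wedge M \ddot x_G = (x_G - a) \wedge R^{(e)} = M_a^{(e)}; \qquad (3.2.23)$$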


i.e., in the case of the gravity forces, the most daring thoughts are allowed: 
Eqs. (3.2.22) and (3.2.23) show the uniqueness of the gravity force case with 
respect to the cardinal equations and they explain why the point defined in 
Eq. (3.2.17) is given the name of “center of gravity”, or “center of mass” or 
“baricenter” . 


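PROOF. From Eq. (3.2.8), by summing both sides over j, it follows that

$$\sum_{j=1}^{N} m_j \ddot x^{(j)} = \sum_{j=1}^{N} f^{(j)} \qquad (3.2.24)$$

but Eqs. (3.2.10) and (3.2.11) and the first of Eqs. (3.2.15) imply $\dot Q = \sum_{j} m_j \ddot x^{(j)} = R^{(e)}$,
i.e., the first of Eqs. (3.2.16). Similarly, by externally multiplying both
sides of Eq. (3.2.8) by $(x^{(j)}(t) - a)$, $a \in \mathbb{R}^d$, and summing:

$$\sum_{j=1}^{N} m_j (x^{(j)}(t) - a) \wedge \ddot x^{(j)}(t) = \sum_{j=1}^{N} (x^{(j)}(t) - a) \wedge f^{(j)} = \sum_{j=1}^{N} (x^{(j)}(t) - a) \wedge f^{(j)}_e = M_a^{(e)} \qquad (3.2.25)$$

having used Eqs. (3.2.10) and (3.2.11) and, particularly, Eq. (3.2.12) in the
third step to eliminate the contribution of the internal forces:

$$\sum_{j=1}^{N} (x^{(j)}(t) - a) \wedge \sum_{i \ne j} f^{(i \to j)} = \sum_{i \ne j} (x^{(j)}(t) - a) \wedge f^{(i \to j)}$$

$$= \frac{1}{2} \sum_{i \ne j} \big\{ (x^{(j)}(t) - a) \wedge f^{(i \to j)} + (x^{(i)}(t) - a) \wedge f^{(j \to i)} \big\} \qquad (3.2.26)$$

$$= \frac{1}{2} \sum_{i \ne j} (x^{(j)}(t) - x^{(i)}(t)) \wedge f^{(i \to j)} = 0$$

because $f^{(i \to j)} = -f^{(j \to i)}$ and $f^{(i \to j)}$ is parallel to $x^{(i)}(t) - x^{(j)}(t)$, by the third
principle. Furthermore,

$$(x^{(j)}(t) - a) \wedge \ddot x^{(j)}(t) = \frac{d}{dt}\big\{ (x^{(j)}(t) - a) \wedge \dot x^{(j)}(t) \big\} - \Big\{ \frac{d}{dt}(x^{(j)}(t) - a) \Big\} \wedge \dot x^{(j)}(t)$$

$$= \frac{d}{dt}\big\{ (x^{(j)}(t) - a) \wedge \dot x^{(j)}(t) \big\} - \dot x^{(j)}(t) \wedge \dot x^{(j)}(t) = \frac{d}{dt}\big\{ (x^{(j)}(t) - a) \wedge \dot x^{(j)}(t) \big\}. \qquad (3.2.27)$$

Hence,

$$\sum_{j=1}^{N} m_j (x^{(j)}(t) - a) \wedge \ddot x^{(j)}(t) = \frac{d}{dt} \sum_{j=1}^{N} m_j (x^{(j)}(t) - a) \wedge \dot x^{(j)}(t) = \frac{d}{dt} K_a \qquad (3.2.28)$$

which, together with Eq. (3.2.25), proves the second equation in (3.2.16).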
mbe 
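
The two cancellations used above, namely the vanishing of the total momentum of the internal forces in Eq. (3.2.26) and the gravity identity (3.2.22), can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the internal forces are taken to be identical linear springs (a hypothetical choice that satisfies the third principle), and the configuration is random.

```python
import random


def cross(u, v):
    """Vector (external) product in R^3."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def sub(u, v):
    return [p - q for p, q in zip(u, v)]


def add(u, v):
    return [p + q for p, q in zip(u, v)]


def scale(c, u):
    return [c * p for p in u]


random.seed(0)
N = 5
masses = [random.uniform(0.5, 2.0) for _ in range(N)]
x = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(N)]
a = [0.3, -0.2, 0.5]     # reference point for the momenta
g = [0.0, 0.0, -9.8]     # fixed "gravity" vector


def internal_force(i, j, k=1.0):
    """Force exerted by point i on point j: a spring pulling j toward i.
    It is parallel to x_i - x_j and changes sign when i and j are swapped."""
    return scale(k, sub(x[i], x[j]))


# Total momentum of the internal forces with respect to a, as in Eq. (3.2.26).
torque_internal = [0.0, 0.0, 0.0]
for j in range(N):
    for i in range(N):
        if i != j:
            torque_internal = add(torque_internal,
                                  cross(sub(x[j], a), internal_force(i, j)))

# Momentum of the gravity forces, computed point by point ...
torque_gravity = [0.0, 0.0, 0.0]
for j in range(N):
    torque_gravity = add(torque_gravity,
                         cross(sub(x[j], a), scale(masses[j], g)))

# ... and as if the total force M g were applied at the baricenter, Eq. (3.2.22).
M = sum(masses)
x_G = [sum(masses[j] * x[j][k] for j in range(N)) / M for k in range(3)]
torque_at_G = cross(sub(x_G, a), scale(M, g))

print(torque_internal)
print(torque_gravity, torque_at_G)
```

The first printed vector should vanish up to rounding, and the last two should agree; replacing gravity by a position-dependent external force generally breaks the second agreement, as stressed in observation (4).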


1. In Appendix P, there is a table of the masses of the nine main planets and of their 
distance from the Sun. The mass and radius of the Sun can also be found there. Find the 
configuration of the planets in which the center of mass of the above ten heavenly bodies 
is farthest from the center of the Sun and compute the ratio of this distance to the Sun 
radius. (Assume that the planets move in circular orbits around the Sun.) 


2. Same as Problem 1, not counting the Sun. 


3. From the data in Appendix P, find the position of the Earth-Moon center of mass relative 
to the Earth and compare its distance from the center of the Earth with the Earth radius. 
(Assume the distance between the Earth and Moon to be equal to the maximal or to the 
minimal distance.) 


4. Find the value of the angular momentum of the Earth-Moon system with respect to the 
center of the Sun, assuming that the latter is fixed in a reference frame with axes fixed with 
the fixed stars. Assume also that the configuration Moon-Earth-Sun is that of a full lunar 
eclipse and neglect the orbital inclination of the Moon. Should the angular momentum be 
time independent? If not, indicate what should be neglected to make it time independent. 
(Hint: The attraction of the Sun on the Earth and on the Moon has vanishing momentum 
with respect to the center of the Sun, while the Sun-Moon forces are internal forces to the 
Earth-Moon system.) 


5. If $V \in C^\infty(\mathbb{R}^{Nd})$ is bounded below, the force law $F = (f^{(1)}, \dots, f^{(N)})$ with
$f^{(i)}(\xi) = -\frac{\partial V}{\partial \xi^{(i)}}(\xi)$, $i = 1, \dots, N$, is actually a force law in the sense of Definition 3, (i). Show the
validity of this statement. (Hint: One need only check condition (iii) of Definition 1. This
is obtained by finding an a priori estimate in the sense of §2.5, using energy conservation.
Proceed along the lines of the analogous one-dimensional case, §2.5, Proposition 6, p.29.)


3.3 The Least Action Principle 


The least action principle seen in §2.24 can be extended to systems of N points
in $\mathbb{R}^d$ subject to conservative forces. Consider the following definition.


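5 Definition.
(i) Let $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ be the set of motions $t \to x(t) = (x^{(1)}(t), \dots, x^{(N)}(t)) \in C^\infty([t_1, t_2])$
such that $x(t_1) = \xi_1$, $x(t_2) = \xi_2$, with $\xi_1, \xi_2, x(t) \in \mathbb{R}^{Nd}$.
(ii) If $x \in \mathcal{M} \subset \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$, then $\mathcal{V}_x(\mathcal{M})$ will denote the space of the
“variations” of the motion x in $\mathcal{M}$: it is the set of the $\mathbb{R}^{Nd}$-valued functions
$y \in C^\infty([t_1, t_2] \times (-1, 1))$, $(t, \varepsilon) \to y(t, \varepsilon)$ such that:

(a) $y(t, 0) = x(t)$, $\forall t \in [t_1, t_2]$ (3.3.1)
(b) $y(t_1, \varepsilon) = \xi_1$, $y(t_2, \varepsilon) = \xi_2$, $\forall \varepsilon \in (-1, 1)$ (3.3.2)
(c) for all $\varepsilon \in (-1, 1)$, the function $t \to y_\varepsilon(t) = y(t, \varepsilon)$, $t \in [t_1, t_2]$, (3.3.3)
is a motion $y_\varepsilon \in \mathcal{M}$. We shall set $\mathcal{V}_x(\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)) = \mathcal{V}_x$.
(iii) If $\mathcal{L} \in C^\infty(\mathbb{R}^{2Nd+1})$ is a real-valued function, define the action with
Lagrangian density $\mathcal{L}$ as the real-valued function A on $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$:

$$A(x) = \int_{t_1}^{t_2} \mathcal{L}(\dot x(t), x(t), t)\, dt \qquad (3.3.4)$$

(iv) The action A in Eq. (3.3.4) is said to be stationary or locally minimal on
the motion $x \in \mathcal{M} \subset \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ if the function $\varepsilon \to A(y_\varepsilon)$, $\varepsilon \in (-1, 1)$, is
stationary or locally minimal for $\varepsilon = 0$ and for all $y \in \mathcal{V}_x(\mathcal{M})$.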


The stationarity condition of Eq. (3.3.4) on all of $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ in x is
deduced exactly along the same lines and patterns followed to prove the analogous
condition seen in Proposition 36, §2.24, p.130, through the principle
of the vanishing integrals (Appendix D). Therefore, the detailed proof of the 
following proposition is left to the reader. 


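4 Proposition. A motion $x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ is a stationary point in
$\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ for the action of Eq. (3.3.4) if and only if

$$\frac{d}{dt}\Big( \frac{\partial \mathcal{L}}{\partial \eta^{(i)}}(\dot x(t), x(t), t) \Big) = \frac{\partial \mathcal{L}}{\partial \xi^{(i)}}(\dot x(t), x(t), t) \qquad (3.3.5)$$

for all $t \in [t_1, t_2]$ and for all $i = 1, 2, \dots, N$.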


Observations. 
(1) In Eq. (3.3.5) we use the notation on the derivatives introduced in §3.1. 
(2) Often, with an abuse of notation, Eq. (3.3.5) is compactly written as

$$\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot x^{(i)}} = \frac{\partial \mathcal{L}}{\partial x^{(i)}} \qquad (3.3.6)$$


An immediate corollary to Proposition 4 is the following. 


5 Proposition. Given a real-valued $C^\infty(\mathbb{R}^{Nd})$ function V, bounded from below,²
the motion $x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ makes the action with Lagrangian density

$$\mathcal{L}(\eta, \xi, t) = \sum_{i=1}^{N} \frac{1}{2} m_i (\eta^{(i)})^2 - V(\xi) \qquad (3.3.7)$$

$m_j > 0$, $j = 1, \dots, N$, stationary on $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ if and only if x is a motion
of N points in $\mathbb{R}^d$ with masses $m_1, \dots, m_N > 0$ which, for $t \in [t_1, t_2]$, develops
subject to the influence of the force law F with potential energy V.


PROOF. It is enough to substitute Eq. (3.3.7) into Eq. (3.3.5) to see that, in 
this case, Eq. (3.3.5) becomes Eq. (3.1.1) with F given by Eq. (3.1.3), i.e., 


$$m_i \ddot x^{(i)}(t) = -\frac{\partial V}{\partial \xi^{(i)}}(x(t)), \qquad i = 1, \dots, N \qquad (3.3.8)$$

if $x(t) = (x^{(1)}(t), \dots, x^{(N)}(t))$, $t \in [t_1, t_2]$. mbe
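
The stationarity statement can be illustrated on a discretized action. The sketch below is a hedged numerical illustration, not a proof: it assumes one point in one dimension with the hypothetical potential energy V(x) = m g x, a simple finite-difference discretization of Eq. (3.3.4), and the particular variation y_ε(t) = x(t) + ε sin(π (t − t1)/(t2 − t1)), which keeps the endpoints fixed.

```python
import math

m, g = 1.0, 9.8            # hypothetical mass and gravity acceleration
t1, t2, n = 0.0, 1.0, 1000
dt = (t2 - t1) / n
times = [t1 + k * dt for k in range(n + 1)]


def potential(x):
    return m * g * x       # V(x) = m g x, so the force is -m g


def action(path):
    """Discretized action: sum of (kinetic - potential) * dt over the subintervals,
    with the velocity as a finite difference and V evaluated at the midpoint."""
    total = 0.0
    for k in range(n):
        velocity = (path[k + 1] - path[k]) / dt
        midpoint = 0.5 * (path[k] + path[k + 1])
        total += (0.5 * m * velocity ** 2 - potential(midpoint)) * dt
    return total


def varied(path, eps):
    """y_eps(t) = x(t) + eps * sin(pi (t - t1)/(t2 - t1)): endpoints stay fixed."""
    return [x + eps * math.sin(math.pi * (t - t1) / (t2 - t1))
            for x, t in zip(path, times)]


def d_action(path, eps=1e-3):
    """Central-difference estimate of d/d(eps) A(y_eps) at eps = 0."""
    return (action(varied(path, eps)) - action(varied(path, -eps))) / (2 * eps)


# A motion from x = 0 back to x = 0 in unit time solving m x'' = -m g ...
true_motion = [0.5 * g * (t - t1) * (t2 - t) for t in times]
# ... and the motion at rest at x = 0, with the same endpoints, which does not.
rest_motion = [0.0 for _ in times]

print(d_action(true_motion))   # derivative along the solution of Eq. (3.3.8)
print(d_action(rest_motion))   # derivative along a motion that is not a solution
```

Only one direction of variation is tested here, so a vanishing derivative is a necessary check rather than a verification of stationarity on all of the variations.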


The following generalization of Proposition 38, §2.24, is also valid, but the 
proof is left as a problem for the reader (since it is essentially a word-by-word
repetition of that of Proposition 38, p.132).


6 Proposition. Let $t \to x(t)$, $t \in \mathbb{R}_+$, be a motion of a system of N points
in $\mathbb{R}^d$, with masses $m_1, \dots, m_N > 0$, developing under the action of a conservative
force with potential energy $V \in C^\infty(\mathbb{R}^{Nd})$. Given $t_1 \in \mathbb{R}_+$ and $t_2 > t_1$,
if $t_2 - t_1$ is small enough, the motion $t \to x(t)$ considered in the time interval
$[t_1, t_2]$ is a point of local minimum in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ for the action with
Lagrangian (3.3.7).


The comments seen at the end of Chapter 2, pp.133-135, extend to the 
contents of this section. It is, in particular, quite important that the reader 
extends to the case of a system of point masses the observations made in §2.24, 
concerning the representations of motions in coordinates other than Cartesian 
coordinates and concerning the invariance of the Lagrange equations (3.3.6) 
with respect to changes in coordinates (see §2.24, p.133 and following). 

In the following sections and in their exercises, we shall see some interest- 
ing applications of the “Lagrangian formulation” (3.3.6) of the equations of 
motion as a “change of coordinates invariant” formulation of such equations. 
Among these will be the theory of perfect constraints. 


2 See Problem 5, §3.2.


3.4 Introduction to the Constrained Motion Theory


Elli avien cappe con cappucci bassi 
Dinanzi a li occhi, fatte della taglia 

Che in Clugnì per li monaci fassi. 

Di fuor dorate son, si ch’elli abbaglia; 
Ma dentro tutte piombo, e gravi tanto... 


The principle of least action inspires the following somewhat trivial considerations.
Let $x \to A(x)$ be the action of Eqs. (3.3.4) and (3.3.7) defined
on the motions in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ of a system of N point masses subject to a
conservative force F.

Suppose it is known a priori that the force law is such that the motion x that
develops under its influence from $\xi_1$ to $\xi_2$, between times $t_1$ and $t_2$, verifies some
properties like $|x(t)| < S$ or $|\dot x(t)| < P$ or $|x^{(1)}(t)| = 0$, etc. Then it is clear
that the search for x in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ can be restricted to the subset $\mathcal{M}_p$ of
the motions in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ verifying the properties under consideration.

Very often it happens that a system of point masses is subject to “constraints”,
i.e., to force laws that allow only a “few” motions among those a
priori possible, at least for vast classes of initial data. Think of a point mass
constrained to remain on a surface: in this case, the surface acts on the point
with a force systematically such as to forbid the abandonment of the surface
itself by the point, whenever the initial data $(\eta, \xi)$ have $\xi$ on the surface and
$\eta$ tangent to it.

Think, also, of a rigid system of N points. Now the i-th point will exert on
the j-th point a force $f^{(i \to j)}$ systematically such that the two points remain
at a fixed distance from each other.

By taking into account the constraints, the allowable motions in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$
will generally be parameterizable with $\ell$ coordinates, and often $\ell \ll Nd$;
consequently, it will be possible to imagine a description of the motions in
terms of $\ell$ functions of time. Therefore, the Lagrangian and the action will
also be expressible in terms of the same $\ell$ functions, and the action of a motion
x allowed by the constraints will take the form

$$A(x) = \int_{t_1}^{t_2} \tilde{\mathcal{L}}(\dot\alpha(t), \alpha(t), t)\, dt \qquad (3.4.1)$$

if $t \to (\alpha_1(t), \dots, \alpha_\ell(t))$, $t \in [t_1, t_2]$, is the description of the motion x in the $\ell$
“essential coordinates”.


3 In basic English: 
They had capes with low hoods 
in front of the eyes, made in the fashion 
that in Cluny is used for the monks. 
Golden they are outside, so that they dazzle 
but inside they are all leaden and heavy a lot ... 
(Dante, Inferno, Canto XXIII).

To be less vague, assume that there are N $\mathbb{R}^d$-valued functions in $C^\infty(\mathbb{R}^\ell)$:

$$\alpha = (\alpha_1, \dots, \alpha_\ell) \to X^{(i)}(\alpha) = X^{(i)}(\alpha_1, \dots, \alpha_\ell), \qquad (3.4.2)$$

$i = 1, \dots, N$, such that the set of the motions $t \to x(t) = (x^{(1)}(t), \dots, x^{(N)}(t))$,
$t \in [t_1, t_2]$, which are “constrained” or “allowed” by the constraints is simply
the set of the motions which is the image of the motions in $\mathbb{R}^\ell$ via the
transformation (3.4.2). Thus, given a motion $t \to \alpha(t)$, $t \in [t_1, t_2]$, in $\mathbb{R}^\ell$ one
describes, via Eq. (3.4.2), the constrained motion $t \to x(t)$, $t \in [t_1, t_2]$, where

$$x^{(i)}(t) = X^{(i)}(\alpha(t)), \qquad i = 1, 2, \dots, N \qquad (3.4.3)$$

which we shorten as $x(t) = X(\alpha(t))$.

In other words, let us admit that the conservative force law F for the
system of N point masses under consideration is such that the motions in
$\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ that can actually develop under its influence starting from a
given class of initial data are necessarily contained in the class of the motions
having the form of Eq. (3.4.3) with $\alpha \in \mathcal{M}_{t_1,t_2}(\alpha_1, \alpha_2)$, where $\alpha_1, \alpha_2 \in \mathbb{R}^\ell$
and $X(\alpha_1) = \xi_1$, $X(\alpha_2) = \xi_2$.

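If x is a constrained motion in the sense just discussed, its action, Eq.
(3.3.4), with respect to the Lagrangian (3.3.7), where V is the potential energy
of F, can be written as in Eq. (3.4.1) if $\tilde{\mathcal{L}} \in C^\infty(\mathbb{R}^{2\ell+1})$ is the function

$$\tilde{\mathcal{L}}(\beta_1, \dots, \beta_\ell, \alpha_1, \dots, \alpha_\ell, t) = \frac{1}{2} \sum_{i=1}^{N} m_i \Big( \sum_{j=1}^{\ell} \frac{\partial X^{(i)}}{\partial \alpha_j}(\alpha)\, \beta_j \Big)^2 - V(X(\alpha)), \qquad (3.4.4)$$

because $\dot x(t)$ can be computed, by differentiating Eq. (3.4.3), as

$$\dot x^{(i)}(t) = \sum_{j=1}^{\ell} \frac{\partial X^{(i)}}{\partial \alpha_j}(\alpha(t))\, \dot\alpha_j(t), \qquad i = 1, 2, \dots, N, \qquad (3.4.5)$$

whenever x is the constrained motion image of $\alpha$: $x = X(\alpha)$.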
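
As an illustration of Eqs. (3.4.4) and (3.4.5), consider the simplest case N = 1 with one essential coordinate. The sketch below assumes, purely as an example not taken from the text, a point on a vertical circle (a pendulum) under gravity; the names `rod`, `X`, `dX` and the numerical values are hypothetical. It builds the reduced Lagrangian from the parameterization and compares it with the familiar closed form.

```python
import math

m, g, rod = 2.0, 9.8, 1.5     # hypothetical mass, gravity, pendulum length


def X(alpha):
    """Parameterization of the type (3.4.2) for one point on a circle of radius
    `rod` in a vertical plane; alpha is the angle from the downward vertical."""
    return [rod * math.sin(alpha), -rod * math.cos(alpha)]


def V(x):
    """Potential energy of gravity; x[1] is the height."""
    return m * g * x[1]


def dX(alpha, h=1e-6):
    """Central finite-difference approximation of dX/d(alpha)."""
    plus, minus = X(alpha + h), X(alpha - h)
    return [(p - q) / (2 * h) for p, q in zip(plus, minus)]


def reduced_lagrangian(beta, alpha):
    """Eq. (3.4.4) with N = 1 and one essential coordinate:
    (m/2) |dX/d(alpha) * beta|^2 - V(X(alpha))."""
    velocity = [c * beta for c in dX(alpha)]          # Eq. (3.4.5)
    kinetic = 0.5 * m * sum(c * c for c in velocity)
    return kinetic - V(X(alpha))


def pendulum_lagrangian(beta, alpha):
    """Closed form obtained by carrying out the derivatives by hand."""
    return 0.5 * m * rod ** 2 * beta ** 2 + m * g * rod * math.cos(alpha)


alpha, beta = 0.7, -1.3
print(reduced_lagrangian(beta, alpha), pendulum_lagrangian(beta, alpha))
```

The finite-difference Jacobian is used only to keep the sketch generic; for a given parameterization the derivatives in Eq. (3.4.4) would normally be computed analytically.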

Hence, if $x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ is the motion that actually develops under
the influence of the force F and if x is the image via Eq. (3.4.3) of $\alpha$, then
the action A with Lagrangian (3.3.7) is stationary in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ on x,
while the action $\tilde A$ with Lagrangian given by Eq. (3.4.4) is stationary on $\alpha$ in
$\mathcal{M}_{t_1,t_2}(\alpha_1, \alpha_2)$. This property is an immediate consequence of the fact that
if A is stationary on a motion x in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ it is also stationary on x in any smaller
set $\mathcal{M}' \subset \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ provided $x \in \mathcal{M}'$. In our case, through Eq. (3.4.3),
$\mathcal{M}'$ would be the set of the motions which is the image of the motions in
$\mathcal{M}_{t_1,t_2}(\alpha_1, \alpha_2)$.

By Proposition 4, §3.3, the stationarity condition for $\tilde A$, i.e., for the action
on $\mathcal{M}_{t_1,t_2}(\alpha_1, \alpha_2)$ with Lagrangian density (3.4.4), is

$$\frac{d}{dt}\Big( \frac{\partial \tilde{\mathcal{L}}}{\partial \beta_i}(\dot\alpha(t), \alpha(t), t) \Big) = \frac{\partial \tilde{\mathcal{L}}}{\partial \alpha_i}(\dot\alpha(t), \alpha(t), t), \qquad (3.4.6)$$

$i = 1, 2, \dots, \ell$, $\forall t \in [t_1, t_2]$.

The importance of the above considerations is easily realized: Eq. (3.4.6) 
is already the equation of motion after the elimination of the parameters 
describing the system, necessary a priori but made “useless” or “redundant” 
by the presence of the constraints which allow one to reduce the number 
of the coordinates needed to describe the actually “possible” configurations, 
from Nd down to $\ell$ via (3.4.2) and (3.4.3).

Therefore, the idea occurs that the mechanism for the elimination of the 
redundant coordinates in conservative systems subject to simple constraints, 
like Eqs. (3.4.2) and (3.4.3), might be particularly simple: it will be enough 
to rewrite the Lagrangian density of the action only in terms of the essential 
coordinates through Eq. (3.4.2) and, then, deduce Eq. (3.4.6). 

However, the principle of conservation of difficulties makes it clear that 
there must be some serious obstacle to the actual applications of such a shining 
but simplistic vision. 

The true constraints are, in fact, generated by forces that, as we shall see 
shortly, generally are neither simple nor conservative (in the sense of Definition 
2, p.142, §3.1) but depend on the velocities of the points as well as on their 
positions. 

In such situations, the above considerations become essentially useless 
since they are not applicable to the simplest and most interesting motions 
constrained in the sense that they are parameterizable as in Eqs. (3.4.2) and 
(3.4.3), by $\ell$ coordinates.

To understand better what has just been said, let us consider the case
of a point constrained to stay on a curve $\Gamma \subset \mathbb{R}^d$ with intrinsic parametric
equations given by

$$s \to \xi(s), \qquad s \in \mathbb{R} \qquad (3.4.7)$$

where s is the curvilinear abscissa on $\Gamma$ (which will be supposed to be a simple
curve, i.e., without double points and open). Assume that the curve $\Gamma$ exerts
a force on the point mass which keeps it on $\Gamma$ for all motions starting from
initial data $(\eta, \xi)$ with $\xi = \xi(s_0)$, $\eta = \xi'(s_0)\, \dot s_0$ (i.e., with $\xi \in \Gamma$ and $\eta$ tangent
to it), with $(s_0, \dot s_0) \in \mathbb{R}^2$.

If $\tau(s)$, $n(s)$ denote, respectively, the tangent and the principal normal
versors to $\Gamma$ at the point with curvilinear abscissa s and if $r(s)$ denotes the
curvature radius at the same point, it is well known that

$$\tau(s) = \frac{d\xi(s)}{ds}, \qquad \frac{n(s)}{r(s)} = \frac{d\tau(s)}{ds}. \qquad (3.4.8)$$

Then if $t \to s(t)$, $t \in \mathbb{R}_+$, is a motion on $\Gamma$ described by the time variation of
the curvilinear abscissa, we find

$$\frac{d}{dt} \xi(s(t)) = \dot s(t)\, \tau(s(t)) \qquad (3.4.9)$$

and

$$\frac{d^2}{dt^2} \xi(s(t)) = \ddot s(t)\, \tau(s(t)) + \frac{\dot s(t)^2}{r(s(t))}\, n(s(t)). \qquad (3.4.10)$$

If the point is subject to a force which is the sum of the constraint reaction
$R(\dot s, s)$ and of an external force $f(s)$, then

$$m \ddot x = f + R \qquad (3.4.11)$$


if $m > 0$ is the mass and $x(t) = \xi(s(t))$ denotes the motion in $\mathbb{R}^d$.
By Eq. (3.4.10), Eq. (3.4.11) becomes

$$m \ddot s = f \cdot \tau + R \cdot \tau, \qquad m \frac{\dot s^2}{r} = f \cdot n + R \cdot n \qquad (3.4.12)$$

and from the second equation, it follows that the normal component of the
constraint reaction is

$$R \cdot n = \frac{m \dot s^2}{r(s)} - f(s) \cdot n(s) \qquad (3.4.13)$$

at the point of $\Gamma$ with coordinate s when it is occupied by a mass m with
speed along $\Gamma$ given by $\dot s$.

From Eq. (3.4.13), one sees that $R(\dot s, s)$ is necessarily $\dot s$ dependent if $0 < r(s) < +\infty$,
as will be supposed, and therefore the constraint reaction cannot
be conservative in the very restrictive sense of §3.1.
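
A short numerical reading of Eq. (3.4.13) makes the velocity dependence explicit. The sketch below assumes a hypothetical example not discussed in the text: a bead at the lowest point of a vertical circle of radius r under gravity, where the principal normal points upwards toward the center, so that f · n = −m g.

```python
def normal_reaction(m, s_dot, r, f_dot_n):
    """Eq. (3.4.13): R.n = m * s_dot**2 / r - f.n."""
    return m * s_dot ** 2 / r - f_dot_n


# Hypothetical data: bead of mass m at the lowest point of a vertical circle.
m, g, r = 0.5, 9.8, 2.0
for s_dot in (0.0, 1.0, 2.0):
    # Same position on the curve, different speeds, different reactions.
    print(s_dot, normal_reaction(m, s_dot, r, -m * g))
```

The position on the curve is the same in all three cases, yet the normal component of the reaction changes with the speed, which is why the reaction cannot be a purely positional force.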

Nevertheless, the essence of the idea which arose in connection with Eq. 
(3.4.6) will be saved: it will, however, be necessary to go through a long 
analysis which, as is to be expected, involves a deeper physico-mathematical 
discussion of the notion of constraint. Such a discussion will be aimed at 
clarifying the definition of constraint, i.e., the physical phenomenon mathe- 
matically modeled as a “constraint” . 

In the next section a general mathematical definition of constraint will be 
presented, stressing its main mathematical properties and delaying until the 
later sections a deeper discussion showing how the empirical notion of a fric- 
tionless constraint is naturally schematized by the introduced mathematical 
structures. 


3.4.1 Exercises 


1. Let $\Gamma$ be a circle in $\mathbb{R}^d$ with radius r. Find $r(s)$, $n(s)$, $\tau(s)$ [see Eq. (3.4.8)].


2. Let $\Gamma$ be an ellipse with equations $z = 0$, $x^2/a^2 + y^2/b^2 = 1$, $a, b > 0$. Find $r(s)$, $n(s)$, $\tau(s)$
at the point $(x, y, 0)$.


3. Show that the force law $R(\dot x, x) = -\frac{m \dot x^2}{r^2}\, x$, $(\dot x, x) \in \mathbb{R}^2 \times \mathbb{R}^2$, produces a constraint for
the motions of a point with mass $m > 0$ with initial data $(\eta, \xi)$ with $\eta \cdot \xi = 0$, $|\xi| = r$. The
constraint is to the circle $\Gamma = \{\xi \in \mathbb{R}^2, |\xi| = r\}$. (Hint: Show that the circular uniform
motion verifies the equations of motion and use the uniqueness theorem.)


4. Same as Problem 3 in $\mathbb{R}^3$, replacing the circle $\Gamma$ with the surface of a sphere.


5. Same as Problem 3, using Archimedes’ spiral (with equations $\rho = a\theta$, $a > 0$, in polar
coordinates), finding an appropriate force R producing a constraint to the spiral.


6. Find an appropriate force R producing a constraint to $\Gamma$, as defined in Problems 3-5, if
the point mass is also subject to a conservative force with potential energy $V = \frac{k}{2} x^2$, $k > 0$.


7. Show that no purely positional force law R can force every motion with initial data $(\eta, \xi)$
with $\eta \cdot \xi = 0$, $|\xi| = 1$, to move on the unit circle in $\mathbb{R}^2$, regardless of the mass m of the
point. (Hint: Let $(\eta_0, \xi_0)$ be an initial datum at $t = 0$ producing a motion which stays on
the circle. Consider the motion with initial datum $(2\eta_0, \xi_0)$ and show that it must abandon
the unit circle, for $t > 0$ and small, by using the Lagrange-Taylor theorem or, alternatively,
by using Eq. (3.4.13).)


The following is a rather general mathematical definition of a constrained
motion for a system of N point masses.


6 Definition. Given s real-valued $C^\infty(\mathbb{R}^{2Nd+1})$-functions $\varphi^{(1)}, \dots, \varphi^{(s)}$ we
shall say that a system of N points, with masses $m_1, \dots, m_N > 0$, subject to
a force law F is constrained by the constraints $\varphi^{(1)}, \dots, \varphi^{(s)}$ if F is such that
the motions $t \to x(t) = (x^{(1)}(t), \dots, x^{(N)}(t))$, $t \in \mathbb{R}_+$, developing under its
influence identically verify the s relations, $i = 1, \dots, s$:

$$\varphi^{(i)}(\dot x^{(1)}(t), \dots, \dot x^{(N)}(t), x^{(1)}(t), \dots, x^{(N)}(t), t) = 0 \qquad (3.5.1)$$

$\forall t \in \mathbb{R}_+$, provided there is a time $\bar t$ (e.g., $\bar t = 0$) when Eq. (3.5.1) holds.


Examples 
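(1) If $V \in C^\infty(\mathbb{R}^{Nd})$ and $E \in \mathbb{R}$, the function

$$\varphi(\eta^{(1)}, \dots, \eta^{(N)}, \xi^{(1)}, \dots, \xi^{(N)}, t) = \frac{1}{2} \sum_{i=1}^{N} m_i (\eta^{(i)})^2 + V(\xi^{(1)}, \dots, \xi^{(N)}) - E \qquad (3.5.2)$$

is a constraint for the motions of a system of N point masses with masses
$m_1, \dots, m_N > 0$ subject to a force law with V as a potential energy.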

(2) Given a system of N points, with masses $m_1, \dots, m_N > 0$, subject to a force
law F verifying the third principle of dynamics and with zero external forces,
let $q, m \in \mathbb{R}^3$. Define the six functions on $\mathbb{R}^{2Nd+1}$ (actually independent of
the last coordinate):

$$(\varphi^{(1)}, \varphi^{(2)}, \varphi^{(3)})(\eta^{(1)}, \dots, \eta^{(N)}, \xi^{(1)}, \dots, \xi^{(N)}, t) = \sum_{i=1}^{N} m_i \eta^{(i)} - q,$$

$$(\varphi^{(4)}, \varphi^{(5)}, \varphi^{(6)})(\eta^{(1)}, \dots, \eta^{(N)}, \xi^{(1)}, \dots, \xi^{(N)}, t) = \sum_{i=1}^{N} m_i\, \xi^{(i)} \wedge \eta^{(i)} - m; \qquad (3.5.3)$$

then the above six functions provide six constraints for the system.

(3) More generally, every conservation law may be interpreted as a constraint. 

(4) The above examples may be pushed to the extremes: given $(\eta_0, \xi_0) \in \mathbb{R}^{2Nd}$
and calling $S_t$ the evolution flow associated with a time independent force
law F acting on a system of N point masses, the 2Nd functions:

$$\varphi(\eta, \xi, t) = S_{-t}(\eta, \xi) - (\eta_0, \xi_0) \qquad (3.5.4)$$

are constraints for the system.
(5) Consider a point with mass $m > 0$ in $\mathbb{R}^3$ subject to a force law given by

$$F(\eta, \xi) = -m\, \frac{\eta^2}{r^2}\, \xi \qquad (3.5.5)$$

where $r > 0$ is constant. Then the following two functions:

$$\varphi^{(1)}(\eta, \xi) = (\xi^2 - r^2)^2 + (\eta \cdot \xi)^2, \qquad \varphi^{(2)}(\eta, \xi) = \xi_3^2 + \eta_3^2 \qquad (3.5.6)$$

are constraints for the system (see Problem 3, §3.4).
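
Example 5 can be explored numerically. The sketch below is hedged in two ways: it assumes that the force law has the form F(η, ξ) = −m (η²/r²) ξ and that the two constraints are (ξ² − r²)² + (η · ξ)² and ξ₃² + η₃² (a reading of the example that should be checked against the printed edition), and it uses a hand-written classical Runge-Kutta integrator with hypothetical step size and initial data.

```python
def acceleration(xi, eta, r):
    """F/m for the assumed force law F = -m * (eta.eta / r**2) * xi."""
    speed2 = sum(c * c for c in eta)
    return [-speed2 / r ** 2 * c for c in xi]


def rk4_step(xi, eta, r, dt):
    """One classical Runge-Kutta step for xi' = eta, eta' = acceleration."""
    def deriv(state):
        x, v = state[:3], state[3:]
        return v + acceleration(x, v, r)      # list concatenation: (x', v')

    s = xi + eta
    k1 = deriv(s)
    k2 = deriv([a + 0.5 * dt * b for a, b in zip(s, k1)])
    k3 = deriv([a + 0.5 * dt * b for a, b in zip(s, k2)])
    k4 = deriv([a + dt * b for a, b in zip(s, k3)])
    s = [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
         for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
    return s[:3], s[3:]


def phi1(xi, eta, r):
    """Vanishes exactly when |xi| = r and eta is orthogonal to xi."""
    xi2 = sum(c * c for c in xi)
    dot = sum(a * b for a, b in zip(xi, eta))
    return (xi2 - r ** 2) ** 2 + dot ** 2


def phi2(xi, eta):
    """Vanishes exactly when the motion stays in the plane of the first two axes."""
    return xi[2] ** 2 + eta[2] ** 2


r = 2.0
xi, eta = [r, 0.0, 0.0], [0.0, 3.0, 0.0]   # on the circle, tangent velocity
worst = 0.0
for _ in range(2000):
    xi, eta = rk4_step(xi, eta, r, 1e-3)
    worst = max(worst, phi1(xi, eta, r), phi2(xi, eta))
print(worst)    # largest value of the two constraint functions along the motion
```

Starting instead from data that violate the constraints at the initial time gives a motion along which the two functions need not vanish, in agreement with the proviso at the end of Definition 6.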


Observation. The above examples of constraints may leave the reader a bit 
perplexed, particularly Example 4. In some sense it shows that all the motions 
can be considered as constrained motions. 

It will be seen that the constraints become interesting only when they can 
actually be “constructed”, so that they can be used to reduce the number 
of degrees of freedom, or of parameters, necessary to describe the motions. A 
constraint of the type in the Example 4 is of little use in practice since it can be 
constructed only when all the motions of the system are perfectly understood 
(i.e., when $S_t$ is a “known transformation”). However, this is usually the aim
of the theory and it cannot be considered as a starting point. 


Particularly interesting are the velocity-independent and time-independent 
constraints. 


7 Definition. In the context of Definition 6, assume that there exist s real-valued
functions in $C^\infty(\mathbb{R}^{Nd})$, $\varphi^{(1)}, \dots, \varphi^{(s)}$, such that, $\forall i = 1, \dots, s$,

$$\varphi^{(i)}(\eta^{(1)}, \dots, \eta^{(N)}, \xi^{(1)}, \dots, \xi^{(N)}, t) \equiv \varphi^{(i)}(\xi^{(1)}, \dots, \xi^{(N)}) \qquad (3.5.7)$$

for all $(\eta, \xi, t) \in \mathbb{R}^{2Nd+1}$. We shall say that the system is “subject to s
holonomous constraints $\varphi^{(1)}, \dots, \varphi^{(s)}$”.⁴ We shall denote $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2 | \varphi^{(1)}, \dots, \varphi^{(s)})$
the subset of the motions in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ consisting of the motions
$x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ such that

$$\varphi^{(j)}(x(t)) = 0 \qquad (3.5.8)$$

This set will be called the set of the motions “subject to” or “compatible with”
the constraints $\varphi^{(1)}, \dots, \varphi^{(s)}$.

Finally, if $\xi, \eta \in \mathbb{R}^{Nd}$, we shall say that $\xi$ is a configuration “compatible
with the constraints” if $\varphi^{(j)}(\xi) = 0$, $j = 1, \dots, s$, and that $\eta$ is a velocity
“compatible with the constraints in $\xi$” if there is a motion $t \to x(t)$, defined
for t near zero, such that $x(0) = \xi$, $\dot x(0) = \eta$ and $x(t)$ is compatible with the
constraints for all t.


Observations. 

(1) By the assumed time invariance of the constraints [see Eq. (3.5.7)], the 
choice of time t = 0 in the last part of Definition 7 has no special meaning. 
(2) In Problem 2 at the end of this section, we mention that when the vectors
$\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi)$, $j = 1, \dots, s$, are s linearly independent vectors in $\mathbb{R}^{Nd}$, the constraint
compatibility condition for a velocity $\eta$ can be analytically expressed as

$$\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi) \cdot \eta = 0, \qquad j = 1, \dots, s \qquad (3.5.9)$$

which has a clear geometrical meaning.

(3) Given $\xi \in \mathbb{R}^{Nd}$ compatible with the constraints, the set of the velocity
vectors $\eta$ compatible with the constraints in $\xi$ is always nonempty since it
contains $\eta = 0$.
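
The compatibility condition (3.5.9) lends itself to a direct numerical check. The following sketch assumes, as a hypothetical example, a single point in three dimensions constrained to the unit sphere, and approximates the gradient of the constraint by central finite differences; the function names and the tolerance are illustrative choices.

```python
def phi(xi, r=1.0):
    """A holonomous constraint for one point in R^3: the sphere |xi|^2 - r^2 = 0."""
    return sum(c * c for c in xi) - r ** 2


def gradient(f, xi, h=1e-6):
    """Central finite-difference gradient of f at xi."""
    grad = []
    for k in range(len(xi)):
        plus = list(xi)
        minus = list(xi)
        plus[k] += h
        minus[k] -= h
        grad.append((f(plus) - f(minus)) / (2 * h))
    return grad


def is_compatible(eta, xi, tol=1e-6):
    """Condition (3.5.9): the gradient of phi at xi is orthogonal to eta."""
    grad = gradient(phi, xi)
    return abs(sum(p * q for p, q in zip(grad, eta))) < tol


xi = [0.0, 0.0, 1.0]                         # a configuration on the unit sphere
print(is_compatible([1.0, -2.0, 0.0], xi))   # tangent velocity
print(is_compatible([0.0, 0.0, 1.0], xi))    # radial velocity
print(is_compatible([0.0, 0.0, 0.0], xi))    # zero velocity, as in observation (3)
```

With several constraints the same test would be repeated for each gradient, under the linear independence assumption stated in observation (2).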


Our first task will now be to set up a precise definition of a “perfect
holonomous constraint”. A possible definition is inspired by Eq. (3.4.12): in
that case, the constraint to the line $\Gamma$ is naturally called “ideal” if $R \cdot \tau = 0$,
i.e., if the “only effect” of the constraint is to keep the motion on $\Gamma$; in fact,
the equation of motion simply becomes

$$m \ddot s(t) = f(s) \cdot \tau(s) \qquad (3.5.10)$$

which can be read “the acceleration along $\Gamma$ is proportional to the projection
on $\Gamma$ of the active force”, i.e., of the part of the force distinct from the
constraint reaction.

The relation $R \cdot \tau = 0$ means that the reaction acts orthogonally to $\Gamma$.
However, it is not immediately clear what should be meant by the reaction
being orthogonal to the constraint in the case of the general constraints
considered in Definition 7.


4 Holonomous simply means “depending on the site”.

After some thought, the following notion appears natural: the constraint
reaction or, more generally, a force law $R(\eta, \xi)$ acting on a system of N point
masses in $\mathbb{R}^d$ occupying the configuration $\xi$ with velocity $\eta$ (both constraint
compatible), is “orthogonal to the constraint” if, calling $\eta'$ the velocity of any
other constraint compatible motion at the time when it occupies the configuration
$\xi$, it is

$$R(\eta, \xi) \cdot \eta' = 0, \qquad (3.5.11)$$

which, more explicitly, is

$$\sum_{i=1}^{N} \eta'^{(i)} \cdot R^{(i)}(\eta, \xi) = 0. \qquad (3.5.12)$$

One could argue and debate about this extension. However, in this section we 
shall first investigate its mathematical meaning, delaying the discussion of its 
deep and interesting physical interpretation until later on. Let us therefore 
establish the following definition. 


8 Definition. Let F be a time-independent force law for a system of N points
in $\mathbb{R}^d$. Assume that F produces s holonomous constraints $\varphi^{(1)}, \dots, \varphi^{(s)}$.
Given a positional force law $F^{(a)} \in C^\infty(\mathbb{R}^{Nd})$ for the system, we define the
“constraints reaction” with respect to the “active force” $F^{(a)}$ as the quantity
$R = F - F^{(a)}$. Furthermore, we shall say that the system of constraints is
“ideal” with respect to the pair $(R, F^{(a)})$ if for all $\xi \in \mathbb{R}^{Nd}$ compatible with
the constraints, i.e., such that $\varphi^{(j)}(\xi) = 0$, $\forall j$ (see Definition 7), it is

$$R(\eta_1, \xi) \cdot \eta_2 = 0 \qquad (3.5.13)$$

for all choices of constraint compatible velocity vectors $\eta_1$, $\eta_2$ (in the sense
of Definition 7). We shall refer to this situation by using the shortened locution
“the system of point masses is subject to the active force $F^{(a)}$ and to s
holonomous ideal constraints $\varphi^{(1)}, \dots, \varphi^{(s)}$”.

Observations. 

(1) Therefore, the last sentence means that the system is subject to a time-independent
force law F producing the constraints $\varphi^{(1)}, \dots, \varphi^{(s)}$, which are
ideal with respect to the active force $F^{(a)}$ and to the “reaction” $R = F - F^{(a)}$.
Strictly speaking, the last sentence of Definition 8 should be subject to a
consistency check: in terms of the information contained in it, it should be
possible to reconstruct the equations of motion at least as far as the constrained
motions are concerned; i.e., given $F^{(a)}$ and the constraints it should
be possible to reconstruct $F(\eta, \xi)$ for all constraint compatible $(\eta, \xi)$. This is
actually possible and, basically, it is the content of Proposition 8 (below) and
of the first observation to it (see, also, Problem 2 at the end of this section).
(2) It is important to stress that the decomposition $F^{(a)} + R$ of the force as
a sum of an “active force” and of an “ideal constraint reaction” is certainly
not unique, if it exists at all. For instance, if $\bar F$ is a conservative force field for
our system whose potential energy $\bar V \in C^\infty(\mathbb{R}^{Nd})$ is constant on the region
of $\mathbb{R}^{Nd}$, where

$$\varphi^{(1)}(\xi) = \dots = \varphi^{(s)}(\xi) = 0, \qquad (3.5.14)$$

then the decomposition

$$F = (F^{(a)} + \bar F) + (R - \bar F) \qquad (3.5.15)$$

can be shown to be another decomposition of F into an “active” part $F^{(a)} + \bar F$
and into a “reaction” $R' = R - \bar F$ verifying Eq. (3.5.13).

In fact, if $t \to x(t)$, $t \in \mathbb{R}_+$, is a constraint compatible motion passing through
$\xi$ at $t = 0$ with speed $\eta_0$, it will be $\bar V(x(t)) = $ constant; hence,

$$0 = \frac{d}{dt} \bar V(x(t)) \Big|_{t=0} = \sum_{i=1}^{N} \frac{\partial \bar V}{\partial \xi^{(i)}}(\xi) \cdot \eta_0^{(i)} = -\bar F(\xi) \cdot \eta_0. \qquad (3.5.16)$$

(3) The ambiguity seen in observation 2 has a physical interpretation: it is 
generally ambiguous to talk about the constraints reactions before having 
specified which are the other forces “not due to the constraints”. Think of a 
point constrained to glide on a horizontal plane: we can always look at it as 
if it were subject to a force orthogonal to the plane and of arbitrary intensity 
G, besides the vertical downward gravity force mg. The point will not change 
its motion, at least in absence of friction, but the reaction of the table will be 
mg upwards in the first case and mg + G upwards in the second. 

(4) Hence, on the basis of the above definition of ideality, the ideality of a
system of constraints depends on the choice of $F^{(a)}$: only once both F and
$F^{(a)}$ are given is it possible to define R and check Eq. (3.5.13). Therefore, the
ideality of a constraint is not a property that can be described only in terms
of the total force F producing it.

When translating concrete problems into a mathematical model, it often happens
that one is given the constraint equations $\varphi^{(1)}, \dots, \varphi^{(s)}$ and, separately, the
active forces $F^{(a)}$ and the “reaction of the constraints” R. In fact, in applications
it is often possible to distinguish operationally between the forces due to
the constraints (“constraint reactions”) and those due to other causes (“active
forces”). In such cases, R is a priori given, or at least some of its properties
are a priori given.

(5) Equation (3.5.13) is often called the “symbolic equation of dynamics”
or “D’Alembert’s principle” or “virtual works principle”. The last name is
usually given to Eq. (3.5.13) in its applications to statics where it is considered
with $\eta_1 = 0$ (see, also, the next comment and Observation 2, p.164, and the
concluding remarks, p.241).

(6) Equation (3.5.13) is also read “the virtual work of an ideal constraint
reaction always vanishes”. This is perhaps the most suggestive way of reading
this equation since it stresses the fact that the velocity vector $\eta_2$ is not the
same as that, $\eta_1$, provoking the reaction in $\xi$. It is, in fact, the velocity of
another possible motion through $\xi$ (“virtual motion”). The word “work” is
naturally a reference to the fact that $R(\eta_1, \xi) \cdot \eta_2$ is the work per unit time
that the constraint reaction to the motion x passing at a given time through $\xi$
with speed $\eta_1$ performs on another motion passing, at the same time, through
$\xi$ with speed $\eta_2$.

In the upcoming sections, we will analyze the physical meaning of Defini- 
tion 8, i.e., we shall discuss the physical circumstances in which it becomes 
a relevant definition. Before that analysis, let us examine some remarkable 
consequences of this definition. 

The first consequence is the following proposition: the “theorem of energy 
conservation for ideally constrained systems”. 


7 Proposition. Let $t \to x(t)$, $t \in I$, be a motion of a system of N point
masses in $\mathbb{R}^d$ with masses $m_1, \dots, m_N > 0$ subject to a system of s ideal
holonomous constraints $\varphi^{(1)}, \dots, \varphi^{(s)}$ and to a conservative active force $F^{(a)}$
with potential energy $V^{(a)} \in C^\infty(\mathbb{R}^{Nd})$. Assume that x respects the constraints.
Then there is a constant E such that

$$T(t) + V^{(a)}(t) = E, \qquad t \in I, \qquad (3.5.17)$$

where $V^{(a)}(t) = V^{(a)}(x(t))$ and $T(t)$ is the kinetic energy of the motion x at
time t.


Observation. The main point is that the above proposition does not assume 
that the reaction of the constraint is conservative in the sense of §3.1, but 
“only” that it is ideal, i.e., that it verifies Eq. (3.5.13). It can be velocity 
dependent, for instance. 


PROOF. It is an immediate consequence of the theorem of alive forces, Proposition 2,
§3.2, p.145, that the variation of kinetic energy between two times
$t_1$ and $t_2$ is equal to the sum of the work performed on the motion x by the
force $F^{(a)}$, i.e., $V^{(a)}(t_1) - V^{(a)}(t_2)$, and by the reaction R, given by

$$\sum_{i=1}^{N} \int_{t_1}^{t_2} R^{(i)}(\dot x(t), x(t)) \cdot \dot x^{(i)}(t)\, dt \qquad (3.5.18)$$

However, by assumption, the motion x respects the constraints and, also, Eq.
(3.5.13) holds. Using Eq. (3.5.13) with $\xi = x(t)$, $\eta_1 = \eta_2 = \dot x(t)$, we see that
the work in Eq. (3.5.18) vanishes; hence, $T(t_2) - T(t_1) = V^{(a)}(t_1) - V^{(a)}(t_2)$,
implying Eq. (3.5.17).

mbe 
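The statement can be checked numerically on a simple case. The following is a minimal sketch, not part of the original text: it assumes a unit point mass in the plane constrained to the unit circle $\varphi(\xi) = |\xi|^2 - 1 = 0$, subject to the active force with potential energy $V^{(a)} = G y$ (the value of `G`, the initial datum and the Runge-Kutta integrator are all illustrative choices). The reaction is taken proportional to the gradient of $\varphi$, with the coefficient fixed by differentiating the constraint twice along the motion; the code then records the power $R \cdot \dot x$ and the drift of $T + V^{(a)}$.

```python
G = 9.8  # strength of the active force (illustrative value, an assumption)


def reaction(x, y, vx, vy):
    """Reaction R = c * (x, y) for the constraint x^2 + y^2 - 1 = 0, unit mass.

    The coefficient c follows from d^2/dt^2 (x^2 + y^2) = 0 combined with
    the equation of motion (acceleration = active force + reaction).
    """
    fx, fy = 0.0, -G                      # active force, minus the gradient of G*y
    speed2 = vx * vx + vy * vy
    c = -(speed2 + x * fx + y * fy) / (x * x + y * y)
    return c * x, c * y


def rhs(state):
    """Right-hand side of the first-order system for (x, y, vx, vy)."""
    x, y, vx, vy = state
    rx, ry = reaction(x, y, vx, vy)
    return (vx, vy, rx, -G + ry)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def energy(state):
    """Kinetic energy plus potential energy of the active force."""
    x, y, vx, vy = state
    return 0.5 * (vx * vx + vy * vy) + G * y


def simulate(state, dt=1e-3, steps=5000):
    """Integrate and record the largest power |R . v| met along the motion."""
    max_power = 0.0
    for _ in range(steps):
        x, y, vx, vy = state
        rx, ry = reaction(x, y, vx, vy)
        max_power = max(max_power, abs(rx * vx + ry * vy))
        state = rk4_step(state, dt)
    return state, max_power


start = (1.0, 0.0, 0.0, 0.5)              # on the circle, tangent velocity
end, max_power = simulate(start)
energy_drift = abs(energy(end) - energy(start))
constraint_error = abs(end[0] ** 2 + end[1] ** 2 - 1.0)
```

In this sketch the reaction depends on the velocity through `speed2`, so it is not conservative in the sense of §3.1; the energy is nevertheless conserved up to integration error because the power of the reaction on the motion itself vanishes.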


Far more interesting is the following proposition: the “least-action principle
for ideally constrained systems”.

8 Proposition. Consider $N$ points in $R^d$, with masses $m_1, \ldots, m_N > 0$,
subject to $s$ holonomous ideal constraints $\varphi^{(1)}, \ldots, \varphi^{(s)}$ and to the active force
$F^{(a)}$, conservative, with potential energy $V^{(a)} \in C^\infty(R^{Nd})$. Denote by $R$ the
constraint reaction. The action with Lagrangian

$$\mathcal L(\eta, \xi) = \sum_{j=1}^{N} \frac{m_j}{2} (\eta^{(j)})^2 - V^{(a)}(\xi) \tag{3.5.19}$$

is stationary in $\mathcal M_{t_1,t_2}(\xi_1, \xi_2 \,|\, \varphi^{(1)}, \ldots, \varphi^{(s)})$ on the motions which are
generated by the force $F^{(a)} + R = F$.

Furthermore, let $t \to x(t)$, $t \in R_+$, be a motion of the system developing
under the action of the force $F$ and respecting the constraints. Given
$t_1 \in R_+$, there exists $\bar t > t_1$ such that if $t_2 \in [t_1, \bar t\,]$, the action with
Lagrangian (3.5.19) is locally minimal in $\mathcal M_{t_1,t_2}(x(t_1), x(t_2) \,|\, \varphi^{(1)}, \ldots, \varphi^{(s)})$
on the motion $x$ observed for $t \in [t_1, t_2]$ and thought of as an element of
$\mathcal M_{t_1,t_2}(x(t_1), x(t_2) \,|\, \varphi^{(1)}, \ldots, \varphi^{(s)})$.


Observations. (1) The importance of the above proposition lies in the fact
that, if wisely used, it allows one to “eliminate” the degrees of freedom which
are redundant because of the constraints. Suppose that one is able to find $N$
$C^\infty$ functions on $R^\ell$ taking values in $R^d$:

$$\alpha = (\alpha_1, \ldots, \alpha_\ell) \to X^{(i)}(\alpha) = X^{(i)}(\alpha_1, \ldots, \alpha_\ell), \tag{3.5.20}$$

$i = 1, \ldots, N$, such that, $\forall \alpha \in R^\ell$:

$$\varphi^{(j)}(X(\alpha)) = 0, \qquad j = 1, \ldots, s \tag{3.5.21}$$

i.e., such that the image of $R^\ell$ via the transformation (3.5.20) is a subset of
$R^{Nd}$ which automatically “verifies the constraints”.⁵

⁵ For instance, in the case of the point constrained on a line (§3.4), one can take $\ell = 1$
and $\alpha \to X(\alpha)$, $\alpha \in R$, and the parameter $\alpha$ has the meaning of a curvilinear
abscissa on the curve.

Also, suppose one knows that the motion $x \in \mathcal M_{t_1,t_2}(\xi_1, \xi_2 \,|\, \varphi^{(1)}, \ldots, \varphi^{(s)})$
that we are studying and which develops under the action of the force $F$, is
the image in $R^{Nd}$ of a motion $a \in \mathcal M_{t_1,t_2}(a(t_1), a(t_2))$ in $R^\ell$ via the
transformation (3.5.20).

The above assumptions mean that we have a good understanding of the structure
of the constraint so that we can find an explicit parametric representation
of a class of configurations satisfying it.

Then the action $A(x)$ with Lagrangian (3.5.19) can be computed on the
motions $x \in \mathcal M_{t_1,t_2}(\xi_1, \xi_2 \,|\, \varphi^{(1)}, \ldots, \varphi^{(s)})$ that are images via Eq. (3.5.20)
of motions $a \in \mathcal M_{t_1,t_2}(a(t_1), a(t_2))$ as $\tilde A(a)$, where $\tilde A$ is the action on
$\mathcal M_{t_1,t_2}(a(t_1), a(t_2))$ with ($t$-independent) Lagrangian

$$\tilde{\mathcal L}(\beta, \alpha) = \sum_{i=1}^{N} \frac{m_i}{2} \Big( \sum_{j=1}^{\ell} \frac{\partial X^{(i)}}{\partial \alpha_j}(\alpha)\, \beta_j \Big)^2 - V^{(a)}(X(\alpha)) \tag{3.5.22}$$

[see, also, §3.4, Eq. (3.4.5), where this is derived]. Since by Proposition 8 $A$
is stationary in $x \in \mathcal M_{t_1,t_2}(\xi_1, \xi_2 \,|\, \varphi^{(1)}, \ldots, \varphi^{(s)})$, then $\tilde A$ is stationary in $a$ on
the entire set $\mathcal M_{t_1,t_2}(a(t_1), a(t_2))$ and, therefore, by Proposition 4, §3.3, this
means that

$$\frac{d}{dt} \Big( \frac{\partial \tilde{\mathcal L}}{\partial \beta_i}(\dot a(t), a(t)) \Big) = \frac{\partial \tilde{\mathcal L}}{\partial \alpha_i}(\dot a(t), a(t)) \tag{3.5.23}$$

for $i = 1, 2, \ldots, \ell$ and $t \in [t_1, t_2]$.

Equations (3.5.23) provide $\ell$ equations for the $\ell$ unknown functions
$t \to a_i(t)$, $i = 1, 2, \ldots, \ell$, $t \in [t_1, t_2]$. These are the equations of motion for the
essential coordinates once the degrees of freedom which have become inessential
because of the constraints have been eliminated.

It is of fundamental importance to realize the difference between the consider- 
ations of this section and those, apparently alike, of §3.4, pp.153-154. Those, 
in fact, had been developed assuming that the force F was conservative in 
the sense of §3.1. In the present case, as the example of §3.4 p.156 shows, the 
force will generally be velocity dependent. After a few exercises the reader will 
understand how great a simplification Eq. (3.5.23) implies in the deduction of 
the equations of motion, if compared to the alternative procedure of writing 
the equations of motion in the ordinary Cartesian coordinates followed by
the elimination of the constraint reactions [remarkably absent in Eq. (3.5.23)]
and of the redundant coordinates. In many instances, for example think of a
rigid body, $N$ can be large but $\ell$ very small.
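The elimination procedure of Observation (1) can be sketched on a one-parameter example. What follows is an illustration under assumptions not made in the text: a single point of mass `M` in the plane, constrained to the parabola $y = x^2/2$, with active potential energy $M G y$ and the parametrization $X(\alpha) = (\alpha, \alpha^2/2)$, so that $\ell = 1$. The reduced Lagrangian of Eq. (3.5.22) is then $\frac{M}{2}(1+\alpha^2)\beta^2 - M G \alpha^2/2$ and Eq. (3.5.23) becomes $(1+\alpha^2)\ddot\alpha + \alpha\dot\alpha^2 + G\alpha = 0$, with no constraint reaction appearing.

```python
G = 9.8   # strength of the active force (illustrative value, an assumption)
M = 1.0   # mass of the point (illustrative value, an assumption)


def embed(alpha):
    """X(alpha): parametrization of the constraint curve y = x^2 / 2."""
    return alpha, 0.5 * alpha * alpha


def reduced_lagrangian(beta, alpha):
    """Reduced Lagrangian in the sense of Eq. (3.5.22) for this example."""
    dx, dy = 1.0, alpha                   # derivative of X with respect to alpha
    _, y = embed(alpha)
    return 0.5 * M * (dx * dx + dy * dy) * beta * beta - M * G * y


def acceleration(alpha, beta):
    """Second derivative of alpha obtained from the Lagrange equation (3.5.23)."""
    return -(alpha * beta * beta + G * alpha) / (1.0 + alpha * alpha)


def rk4_step(alpha, beta, dt):
    """One classical Runge-Kutta step for the pair (alpha, d alpha / dt)."""
    def f(a, b):
        return b, acceleration(a, b)
    k1a, k1b = f(alpha, beta)
    k2a, k2b = f(alpha + 0.5 * dt * k1a, beta + 0.5 * dt * k1b)
    k3a, k3b = f(alpha + 0.5 * dt * k2a, beta + 0.5 * dt * k2b)
    k4a, k4b = f(alpha + dt * k3a, beta + dt * k3b)
    return (alpha + dt / 6 * (k1a + 2 * k2a + 2 * k3a + k4a),
            beta + dt / 6 * (k1b + 2 * k2b + 2 * k3b + k4b))


def energy(alpha, beta):
    """Energy beta * dL/dbeta - L associated with the reduced Lagrangian."""
    momentum = M * (1.0 + alpha * alpha) * beta
    return momentum * beta - reduced_lagrangian(beta, alpha)


alpha, beta = 1.0, 0.0                    # initial datum on the constraint
initial_energy = energy(alpha, beta)
for _ in range(3000):
    alpha, beta = rk4_step(alpha, beta, 1e-3)
energy_drift = abs(energy(alpha, beta) - initial_energy)
```

The energy computed from the reduced Lagrangian stays constant along the numerical solution up to integration error, in agreement with Proposition 7.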

(2) It is convenient to say a few words to explain why the name “principle” 
is granted to the Proposition 8 as well as to several other propositions or 
definitions already met (D’Alembert’s principle, virtual work principle, etc.). 
Such names have interesting historical origins: the reader should not believe 
that the discussion of the laws of mechanics and the treatment of all the me- 
chanical problems by the application of the equation f = ma, together with 
the two other laws of dynamics, to the point masses into which a system can 
be ideally decomposed has always been obvious and natural since the work of 
Newton. 

As already remarked, Newton himself did not arrive in a very clear way to such 
a conclusion. For instance, in his study of rigid motions he had recourse to 
arguments quite different from modern methods based on the cardinal equa- 
tions (i.e., on Newton’s laws). 

Both before and after Newton, philosophers were accustomed to studying me- 
chanical problems on the basis of special assumptions, “principles”, which 
were deduced by them through more or less general considerations often a bit 
obscure. 

Newton’s principles can be thought of as belonging to the above class of
principles, and, initially, they were used particularly in the theory of the motions
of heavenly bodies. Together with the three principles first formulated by Newton, there
already existed, more or less clearly formulated at least in particular cases, 
the energy conservation principle for simple systems (Huygens), the principle 
of the linear momentum conservation (going back at least to Descartes), the 
virtual works principle (which was used in the solution of problems in stat- 
ics by Del Monte, Galilei, Stevin, etc.), the inertia principle (Galilei), and to 
these principles many others can be added: they were invented, even in years 
following Newton’s time, to treat complex mechanical problems. 

With Euler’s work, the synthesis of all the different principles began through 
the realization that they could all be unified and deduced from Newton’s and, 
what is often not sufficiently clearly stated, equivalent to Newton’s if suit- 
ably interpreted (probably beyond the intentions and meanings the inventors 
attributed to them) (see, also, the concluding remarks to Chapter 3, p.241). 


Let us go back to the simple proof of Proposition 8. 


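PROOF. Let $x \in \mathcal M_{t_1,t_2}(\xi_1, \xi_2 \,|\, \varphi^{(1)}, \ldots, \varphi^{(s)}) \equiv \mathcal M$ be a motion developing
under the influence of $F = F^{(a)} + R$. The action of $x$ with respect to the
Lagrangian (3.5.19) is

$$A(x) = \int_{t_1}^{t_2} \Big\{ \sum_{j=1}^{N} \frac{m_j}{2} (\dot x^{(j)}(t))^2 - V^{(a)}(x(t)) \Big\}\, dt. \tag{3.5.24}$$

Let $y \in \mathcal V_x(\mathcal M)$; let us compute the derivative with respect to $\varepsilon$ of the
function $A(y_\varepsilon)$ in $\varepsilon = 0$. If we set (see Definition 5, §3.3)
$z(t) = \frac{\partial y_\varepsilon}{\partial \varepsilon}(t) \big|_{\varepsilon=0} = (z^{(1)}(t), \ldots, z^{(N)}(t))$, $t \in [t_1, t_2]$, we have $z(t_1) = z(t_2) = 0$ and

$$\frac{d}{d\varepsilon} A(y_\varepsilon) \Big|_{\varepsilon=0} = \int_{t_1}^{t_2} \sum_{j=1}^{N} \Big\{ m_j\, \dot x^{(j)}(t) \cdot \dot z^{(j)}(t) - \frac{\partial V^{(a)}}{\partial \xi^{(j)}}(x(t)) \cdot z^{(j)}(t) \Big\}\, dt. \tag{3.5.25}$$

By integrating the terms containing $\dot z^{(j)}(t)$ by parts and using $z(t_1) = z(t_2) = 0$,
one deduces, as usual,

$$\frac{d}{d\varepsilon} A(y_\varepsilon) \Big|_{\varepsilon=0} = - \int_{t_1}^{t_2} \sum_{j=1}^{N} \Big\{ m_j\, \ddot x^{(j)}(t) + \frac{\partial V^{(a)}}{\partial \xi^{(j)}}(x(t)) \Big\} \cdot z^{(j)}(t)\, dt. \tag{3.5.26}$$

The equations of motion for $x$ are, by assumption,

$$m_j\, \ddot x^{(j)}(t) = - \frac{\partial V^{(a)}}{\partial \xi^{(j)}}(x(t)) + R^{(j)}(\dot x(t), x(t)); \tag{3.5.27}$$

hence, we cannot conclude that the right-hand side of Eq. (3.5.26) vanishes,
but only that it is equal to

$$\frac{d}{d\varepsilon} A(y_\varepsilon) \Big|_{\varepsilon=0} = - \int_{t_1}^{t_2} \sum_{j=1}^{N} R^{(j)}(\dot x(t), x(t)) \cdot z^{(j)}(t)\, dt. \tag{3.5.28}$$

However, Eq. (3.5.13) will allow us to infer that, $\forall \tau \in (t_1, t_2)$:

$$\sum_{j=1}^{N} R^{(j)}(\dot x(\tau), x(\tau)) \cdot z^{(j)}(\tau) = 0 \tag{3.5.29}$$

if we show that $(x(\tau), z(\tau))$ are a position-velocity pair compatible with the
constraints; i.e., if we show the existence of a motion defined for $t$ near $\tau$ and
constraint compatible, which at $t = \tau$ is in $x(\tau)$ with velocity $z(\tau)$.

Recalling the definition of $z$, one sees that such a motion indeed exists. To
build it, one simply defines $t \to y(\tau, t - \tau)$ for $t - \tau \in (-1, 1)$, i.e., for $t$ close
to $\tau$. This function of $t$ has, for $t = \tau$, velocity $z(\tau)$ and, furthermore, verifies
the constraints and has a value $x(\tau)$, for $t = \tau$, since $y \in \mathcal V_x(\mathcal M)$.

We shall not explicitly prove the local minimum property: its (long) proof
is entirely analogous to the proof of Proposition 37, §2.24, and should not
present particular difficulties to the reader. ∎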


3.5.1 Problems 


1. Give an example of a holonomous constraint for a system of $N$ point masses in $R^3$ for
which the only constraint compatible velocity $\eta$ is $\eta = 0$. (Hint: Find a constraint $\varphi$ such
that $\varphi(\xi) = 0$ determines an isolated point.)


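2.* Let $\varphi^{(1)}, \ldots, \varphi^{(s)}$ be $s$ holonomous constraints for a system of $N$ point masses in
$R^3$. Given a constraint compatible $\xi \in R^{3N}$, assume that the $s$ vectors
$\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi) \in R^{3N}$, $j = 1, \ldots, s$, are linearly independent. Show that $\eta$ is a constraint
compatible velocity if and only if Eq. (3.5.9) holds. (Hint: The necessity is obvious.
Conversely, consider the conditions on a constraint compatible motion of the form
$t \to \xi + t\eta + \sum_{j=1}^{s} \delta_j \frac{\partial \varphi^{(j)}}{\partial \xi}(\xi)$ given by

$$\varphi^{(k)}\Big( \xi + t\eta + \sum_{j=1}^{s} \delta_j \frac{\partial \varphi^{(j)}}{\partial \xi}(\xi) \Big) = 0, \qquad k = 1, \ldots, s$$

which are regarded as equations for $\delta_j$ parameterized by $t$ and solved, for $t = 0$, by $\delta_j = 0$.
We now regard the left-hand side as a function of $t, \delta_1, \ldots, \delta_s$, and call it $\Phi^{(k)}(t, \delta_1, \ldots, \delta_s)$
and we try to define $\delta_j(t)$, $j = 1, \ldots, s$, for $t$ near zero, applying the implicit functions
theorem (Appendix G). The Jacobian matrix for $t = 0$, $\delta_1 = \ldots = \delta_s = 0$, is

$$M_{hk} = \frac{\partial \Phi^{(h)}}{\partial \delta_k}(0, 0) = \sum_{p=1}^{3N} \frac{\partial \varphi^{(h)}}{\partial \xi_p}(\xi)\, \frac{\partial \varphi^{(k)}}{\partial \xi_p}(\xi), \qquad h, k = 1, \ldots, s$$

which has rank $s$, by the supposed linear independence of the vectors $\frac{\partial \varphi^{(k)}}{\partial \xi}(\xi)$. In fact, the
linear independence means that, $\forall c \in R^s$, $\sum_{k=1}^{s} c_k \frac{\partial \varphi^{(k)}}{\partial \xi}(\xi) \ne 0$ unless $c = 0$; therefore, if
$c \ne 0$:

$$\sum_{h,k=1}^{s} \sum_{p=1}^{3N} c_h c_k\, \frac{\partial \varphi^{(h)}}{\partial \xi_p}(\xi)\, \frac{\partial \varphi^{(k)}}{\partial \xi_p}(\xi) = \sum_{p=1}^{3N} \Big( \sum_{h=1}^{s} c_h \frac{\partial \varphi^{(h)}}{\partial \xi_p}(\xi) \Big)^2 > 0$$

and, since $M_{hk} = M_{kh}$, this means that the matrix $M$ is positive definite; hence $\det M > 0$
(see Appendix F). Thus, by the implicit functions theorem, there exist $s$ $C^\infty$ functions
$t \to \delta_j(t)$, $j = 1, \ldots, s$, defined near $t = 0$ such that the motion
$t \to \xi + t\eta + \sum_{j=1}^{s} \delta_j(t) \frac{\partial \varphi^{(j)}}{\partial \xi}(\xi)$
verifies the constraints and, furthermore, $\delta_j(0) = 0$ and, by the implicit functions theorem:

$$\delta_k(t) = -t \sum_{h=1}^{s} (M^{-1})_{kh}\, \frac{\partial \varphi^{(h)}}{\partial \xi}(\xi) \cdot \eta + t^2 \sigma_k(t)$$

for some $C^\infty$ functions $\sigma_k(t)$ defined near $t = 0$. By the assumption on $\eta$, the $t$-linear term
vanishes: hence, $\dot \delta_k(0) = 0$, i.e., $\dot x(0) = \eta$.)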


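3.* Given a system of $N$ points in $R^3$, with masses $m_1, \ldots, m_N > 0$, subject to ideal
holonomous constraints $\varphi^{(1)}, \ldots, \varphi^{(s)}$ and to active force $F^{(a)}$, show the possibility of an
explicit expression for the reaction $R$ acting on a constraint compatible motion, at a time
$t$ when $x(t) = \xi$, $\dot x(t) = \eta$. (Hint: Let $m_1 = \ldots = m_N = 1$ for simplicity and suppose also
that the $s$ vectors in $R^{3N}$, $\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi)$, $j = 1, \ldots, s$, are independent, again for simplicity.
From $\varphi^{(j)}(x(t)) = 0$, $j = 1, \ldots, s$, deduce by two-fold differentiation

$$\sum_{p,q=1}^{3N} \frac{\partial^2 \varphi^{(j)}}{\partial \xi_p \partial \xi_q}(x(t))\, \dot x_p(t)\, \dot x_q(t) + \sum_{p=1}^{3N} \frac{\partial \varphi^{(j)}}{\partial \xi_p}(x(t))\, \ddot x_p(t) = 0$$

and then combine this equation with the equation of motion $\ddot x = F^{(a)} + R$ to obtain

$$\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi) \cdot R(\eta, \xi) = - \sum_{p,q=1}^{3N} \frac{\partial^2 \varphi^{(j)}}{\partial \xi_p \partial \xi_q}(\xi)\, \eta_p \eta_q - \frac{\partial \varphi^{(j)}}{\partial \xi}(\xi) \cdot F^{(a)},$$

$j = 1, \ldots, s$. But, by the ideality assumption, $R(\eta, \xi)$ has to be orthogonal to every $\eta'$
such that $\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi) \cdot \eta' = 0$ [see Problem 2 and Eq. (3.5.13)]. Hence, $R$ has to be a linear
combination of the $s$ vectors $\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi)$, $j = 1, \ldots, s$, and the coefficients can be determined
by the scalar products $\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi) \cdot R$, $j = 1, \ldots, s$, since the $s$ vectors $\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi)$ are linearly
independent. Deal also with the general case: different masses and linearly dependent vectors
$\frac{\partial \varphi^{(j)}}{\partial \xi}(\xi)$.)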


4. A “constraint” of the form $\varphi(\xi) \ge 0$ for a system of $N$ point masses in $R^3$ is called
“unilateral”. Show that such constraints are not more general than those considered in
Definition 7. (Hint: Let $\alpha \to \chi(\alpha)$ be a $C^\infty$ function, strictly positive if $\alpha < 0$ and zero if
$\alpha \ge 0$; then consider the constraint $\psi(\xi) = \chi(\varphi(\xi)) = 0$, etc.)


5. Show that any velocity is compatible with a unilateral constraint $\varphi \ge 0$ in the positions
$\xi$, where $\varphi(\xi) > 0$.


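6. Which are the velocities $\eta$ compatible with a unilateral constraint $\varphi(\xi) \ge 0$ in a position
$\xi$ where $\varphi(\xi) = 0$? Suppose $\frac{\partial \varphi}{\partial \xi}(\xi) \ne 0$. (Answer: Those such that $\eta \cdot \frac{\partial \varphi}{\partial \xi}(\xi) = 0$, i.e., the
same as those for the constraint $\varphi(\xi) = 0$!)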


7. Extend the notion of velocity $\eta$ compatible, in a configuration $\xi$, with some holonomous
(unilateral or not) constraints $\varphi^{(1)}, \ldots, \varphi^{(s)}$ by saying that $\eta$ is constraint compatible at $\xi$
if there is a constraint compatible motion $t \to x(t)$, defined for $t \ge 0$ small enough (rather
than for $|t|$ small enough) such that $\dot x(0) = \eta$, $x(0) = \xi$. Show that there are cases where $\eta$
can be constraint compatible in this new sense without being so in the old one of Definition
7. We call the velocities which are constraint compatible in the new sense “(+)-compatible
velocities”. (Hint: The two notions will differ when $\xi$ is such that $\varphi(\xi) = 0$ in the case of a
unilateral constraint $\varphi \ge 0$. Give a physical interpretation of such extra velocities in terms
of “collision velocities” with the constraint.)


8.* Show that the smoothness requirements on $F$ and $F^{(a)}$ used for the Definition 8 of
an ideal constraint cannot generally hold if the system is subject to a unilateral constraint
$\varphi \ge 0$ (thought of as a constraint via the construction of Problem 4); i.e., a unilateral
constraint cannot, in general, be ideal in the sense of Definition 8. (Hint: There are, in
general, motions starting in the region $\varphi > 0$ which in a finite time reach a point $\xi_0$,
where $\varphi(\xi_0) = 0$, “collision with the constraint”, with a speed which is not (+)-constraint
compatible in the sense of Problem 7. At this point the speed must have a discontinuity
against the assumption that $F$ is of class $C^\infty$ and the regularity theorem for the differential
equations.)


3.6 Real and Ideal Constraints 


The discussion of §3.5 is largely unsatisfactory. 

The notion of constraints used there has been given on a purely mathe- 
matical basis and it is quite unclear which is the physical phenomenon math- 
ematically modeled by the constraints, ideal or not, of the preceding sections. 

In this and in the following sections, we will radically modify the point
of view to show that an ideal constraint for a system of $N$ point masses can
also be thought of as a limiting case of suitable very strong conservative force
fields which oblige the trajectories to lie on certain surfaces in $R^{Nd}$ or in their
vicinity.

From a physical viewpoint, one always imagines a constraint as a complex
of forces acting on a system of point masses and due to their tendency to
deform some obstacles. Such a tendency provokes imperceptibly small (at
least as far as our observations are concerned⁶) deformations of the obstacles.
Think of a point constrained on a rail or on a surface, or think of a rigid
system.

Note, also, that in the above concrete cases, the elegant theory of §3.5 is 
totally useless: the constraints now constrain in an approximate sense only 
and, therefore, they are not of the type considered there. 

The question which is more interesting for us in this context is whether
or not the solutions of the equations obtained by minimizing the Lagrangian
(3.5.19) on the motions constrained by $s$ holonomous constraints $\varphi^{(1)}, \ldots, \varphi^{(s)}$
(see Definition 8, §3.5) provide good approximations to the real motion under
the influence of the real constraints, which necessarily constrain only in an
approximate sense.

This is a really interesting problem in physics and applications, in contrast
with the question underlying §3.5 which, abstractly, asked for a definition of
a perfect constraint that would give rise to a sufficient condition in order
that the equations of motion could be deduced from the least-action principle
associated with the Lagrangian (3.5.19), for the $(\varphi^{(1)}, \ldots, \varphi^{(s)})$-constrained
motions (see §3.5, Proposition 8, p.163).

⁶ When they can be appreciated, one no longer speaks of a constraint.

To understand better the spirit and the meaning of the various definitions 
that will follow, it is convenient to analyze a simple but significant example. 

Consider a point with mass $m > 0$ in $R^2$ and suppose that it is subject to
an elastic force with potential energy $V^{(a)} = \frac{m\omega^2}{2}(x^2 + y^2)$ and to a restoring
conservative force toward the $y = 0$ axis with potential energy

$$\lambda W(x, y) = \frac{\lambda m}{2}\, y^2. \tag{3.6.1}$$

Consider the motions under the action of the force with potential energy

$$V^{(a)} + \lambda W = \frac{m\omega^2}{2}(x^2 + y^2) + \frac{\lambda m}{2}\, y^2. \tag{3.6.2}$$

It is intuitively clear that if $\lambda$ is very large, such a force simulates a constraint
to the line $y = 0$ in a sense which has still to be precisely understood. For this
purpose study the motions which start on the $y = 0$ axis and develop under
the influence of the force with potential energy of Eq. (3.6.2). The equations
of motion are

$$m\ddot x = -\omega^2 m x, \qquad m\ddot y = -\omega^2 m y - m\lambda y,$$
$$x(0) = x_0, \quad \dot x(0) = v_0, \quad y(0) = 0, \quad \dot y(0) = w_0, \tag{3.6.3}$$

and, to be definite, suppose $x_0 > 0$.

Because of their extreme simplicity, Eqs. (3.6.3) can be elementarily solved:
if $t \to (x_\lambda(t), y_\lambda(t))$, $t \in R$, denotes the solution of Eqs. (3.6.3):

$$x_\lambda(t) = \sqrt{x_0^2 + \frac{v_0^2}{\omega^2}}\, \cos(\omega t + \varphi_0), \qquad y_\lambda(t) = \frac{w_0}{\sqrt{\omega^2 + \lambda}}\, \sin\big(\sqrt{\omega^2 + \lambda}\; t\big), \tag{3.6.4}$$

with $\varphi_0 = -\operatorname{arctg} \frac{v_0}{\omega x_0}$. One then sees that the limit as $\lambda \to \infty$ of the motion
of Eqs. (3.6.4) is the motion $t \to (x(t), y(t))$ with

$$x(t) = \sqrt{x_0^2 + \frac{v_0^2}{\omega^2}}\, \cos(\omega t + \varphi_0), \qquad y(t) = 0 \tag{3.6.5}$$

for all $w_0$. This is exactly the solution of the equations obtained by imposing
stationarity on the motions constrained by the ideal holonomous constraint
$\xi_2 = 0$ for the action with Lagrangian:

$$\mathcal L = \frac{m}{2}(\eta_1^2 + \eta_2^2) - \frac{m\omega^2}{2}(\xi_1^2 + \xi_2^2). \tag{3.6.6}$$

On the basis of Observation (1) to Proposition 8 of §3.5, these equations
coincide with those for the motions $t \to x(t)$, $t \in R$, in $R^1$ which make
stationary the “constrained Lagrangian”:

$$\tilde{\mathcal L} = \frac{m}{2}\, \eta_1^2 - \frac{m\omega^2}{2}\, \xi_1^2 \tag{3.6.7}$$

which, in our case, is what Eq. (3.6.6) becomes by imposing the constraint
$\xi_2 = 0$.

It is interesting to note that the more “rigid” the approximate constraint
realized by Eq. (3.6.2), the smaller the deviations from a motion respecting
the constraints ($|y_\lambda| \le O(\lambda^{-1/2})$ for $\lambda$ large) [see Eqs. (3.6.4)]. At the same
time, however, the coordinate $y_\lambda(t)$, which simply represents the violation of
the constraint, oscillates more and more rapidly: in fact, its vibrations have a
frequency:

$$\nu = \frac{\sqrt{\omega^2 + \lambda}}{2\pi}. \tag{3.6.8}$$

These very small but very-high-frequency vibrations (“fatigue vibrations” of
the constraint) provide a good intuitive representation of the effect of an
approximate ideal constraint on the motion of a system. In general, it is possible
to think that a system of $N$ point masses subject to an approximate ideal
constraint moves as if it were on the surface $\Sigma \subset R^{Nd}$ defined by the
constraint, with some very small elongations orthogonal to $\Sigma$: described by
oscillatory motions with very small amplitude and very large frequency.
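The behaviour of the transversal coordinate in the example above can be tabulated directly from the explicit solution of Eqs. (3.6.3). The sketch below is only an illustration: it assumes the initial datum $y(0) = 0$, $\dot y(0) = w_0$ used in the text, and the numerical values of `omega`, `w0` and of the rigidities are arbitrary choices.

```python
import math


def y_lambda(t, lam, omega=1.0, w0=1.0):
    """Transversal coordinate solving y'' = -(omega^2 + lam) y, y(0)=0, y'(0)=w0."""
    big_omega = math.sqrt(omega ** 2 + lam)
    return w0 / big_omega * math.sin(big_omega * t)


def amplitude(lam, omega=1.0, w0=1.0):
    """Largest violation of the constraint y = 0."""
    return w0 / math.sqrt(omega ** 2 + lam)


def frequency(lam, omega=1.0):
    """Frequency of the transversal vibrations."""
    return math.sqrt(omega ** 2 + lam) / (2.0 * math.pi)


rigidities = [1.0e2, 1.0e4, 1.0e6]        # illustrative values of lambda
amplitudes = [amplitude(lam) for lam in rigidities]
frequencies = [frequency(lam) for lam in rigidities]

for lam, amp, freq in zip(rigidities, amplitudes, frequencies):
    print(f"lambda={lam:.0e}  amplitude={amp:.3e}  frequency={freq:.3e}")
```

As the rigidity grows the amplitude decreases like $\lambda^{-1/2}$ while the frequency grows like $\lambda^{1/2}$, which is the qualitative picture of the “fatigue vibrations” described above.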

On the basis of the above heuristic discussion, the following definition 
should appear quite natural. 


9 Definition. Given a system of $N$ point masses, with masses $m_1, \ldots, m_N$
in $R^d$, let $\Sigma \subset R^{Nd}$ be a closed set and let $W$ be a real $C^\infty(R^{Nd})$ function
vanishing on $\Sigma$ and having there a strict minimum; i.e., $\Sigma$ is the set of the
points $\xi \in R^{Nd}$ where $W(\xi) = 0$ and for all $\xi \notin \Sigma$ it is $W(\xi) > 0$. We shall
say that the conservative force law with potential energy

$$\xi \to \lambda W(\xi) \ge 0, \qquad \lambda > 0 \tag{3.6.9}$$

is a “model of conservative approximate constraint to the region $\Sigma$ with
structure $W$ and rigidity $\lambda$”.

We shall denote such a force law by $(\Sigma, W, \lambda)$. If $\Sigma$ is a regular surface with
dimension $\ell < Nd$, we shall say that the constraint model $(\Sigma, W, \lambda)$ is a bilateral
approximate conservative constraint “with dimension $\ell$” or “with codimension
$Nd - \ell$” (see also Definition 10 below).


Observation. In general $\Sigma$ may contain interior points: in this case, one says
that Eq. (3.6.9) is a model for a “unilateral” approximate constraint.


It is convenient to recall the definition of a regular surface in $R^d$.


10 Definition. Let $\beta \to \Xi(\beta)$ be a $C^\infty$ function defined on a convex open
set $\Omega \subset R^d$, taking its values in a neighborhood $U \subset R^d$. Suppose that $\Xi$ is
invertible, i.e., one to one, as a map of $\Omega$ onto $U$ and, furthermore, assume
that $\Xi$ is nonsingular, i.e., that its Jacobian matrix defined by

$$J(\beta)_{ij} = \frac{\partial \Xi_i}{\partial \beta_j}(\beta), \qquad i, j = 1, \ldots, d \tag{3.6.10}$$

has a non vanishing determinant, $\forall \beta \in \Omega$.
We shall say that $\Xi$ “establishes a regular system of coordinates on $U$” and,
if $\xi = \Xi(\beta)$, we shall say that $\beta \in \Omega$ is the coordinate of $\xi$ in the coordinate
system on $U$ associated with $\Xi$. The just-described coordinate system will be
denoted $(U, \Xi)$ and $\Omega$ will be called the “basis” of the coordinate system.
A closed set $\Sigma \subset R^d$ will be called a “regular $\ell$-dimensional surface” if for
all $\xi_0 \in \Sigma$ it is possible to find a neighborhood $U_0$ and a regular system of
coordinates $(U_0, \Xi)$ with basis $\Omega_0$ such that the points of $\Sigma \cap U_0$ are all those
with coordinates $\beta = (\beta_1, \ldots, \beta_d)$ such that

$$\beta_1 = \beta_2 = \ldots = \beta_{d-\ell} = 0. \tag{3.6.11}$$

We say that $(U_0, \Xi)$ is a local system of coordinates “adapted to $\Sigma$”.
A regular $s$-dimensional surface is also called a regular surface of codimension
$d - s$.
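The nonsingularity condition on the Jacobian matrix (3.6.10) can be tested numerically for a given coordinate map. The following sketch is only an illustration under stated assumptions: it uses plane polar coordinates $\Xi(\rho, \theta) = (\rho\cos\theta, \rho\sin\theta)$ as the map (a choice suggested by Problem 1 of the exercises to this section) and approximates the partial derivatives by central differences; the helper names are hypothetical.

```python
import math


def polar(beta):
    """Coordinate map (rho, theta) -> (rho cos theta, rho sin theta)."""
    rho, theta = beta
    return (rho * math.cos(theta), rho * math.sin(theta))


def jacobian(coord_map, beta, h=1e-6):
    """Central-difference approximation of the matrix J_ij = d Xi_i / d beta_j."""
    d = len(beta)
    columns = []
    for j in range(d):
        plus, minus = list(beta), list(beta)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = coord_map(plus), coord_map(minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    # columns[j][i] holds d Xi_i / d beta_j; transpose to get rows indexed by i
    return [[columns[j][i] for j in range(d)] for i in range(d)]


def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


rho, theta = 2.0, 0.7                     # an arbitrary point with rho > 0
det_at_point = det2(jacobian(polar, (rho, theta)))
```

For polar coordinates the determinant equals $\rho$, so it is non vanishing on any basis $\Omega$ that excludes $\rho = 0$; the map is also one to one if $\theta$ is restricted to an interval shorter than $2\pi$.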

Going back to the definition of approximate conservative constraint, attention
will be confined, from now on, to approximate bilateral conservative
constraints with dimension $\ell$, or, as it is customary to say, with “$\ell$ degrees of
freedom”, $0 < \ell < Nd$.

Consider a system of $N$ points in $R^d$, with masses $m_1, \ldots, m_N > 0$, subject
to the action of a conservative force with potential energy $V^{(a)} \in C^\infty(R^{Nd})$,
bounded from below, and to a model of approximate conservative constraint
$(\Sigma, W, \lambda)$, bilateral and $\ell$-dimensional.

Suppose that $\Sigma$ is defined by equations of the type

$$\varphi^{(1)}(\xi) = \ldots = \varphi^{(s)}(\xi) = 0 \tag{3.6.12}$$

with $\varphi^{(j)} \in C^\infty(R^{Nd})$ (the number $s$ need not be $(Nd - \ell)$, although this
will often be the case). It is natural to study the motions $t \to x_\lambda(t)$, $t \in R_+$,
developing under the action of the conservative force with potential energy

$$\xi \to V^{(a)}(\xi) + \lambda W(\xi), \tag{3.6.13}$$

following an initial datum $x_\lambda(0) = \xi_0$, $\dot x_\lambda(0) = \eta_0$ with $\xi_0$ compatible with
the constraints

$$\xi_0 \in \Sigma. \tag{3.6.14}$$

The question is whether there exists a limit

$$x(t) = \lim_{\lambda \to +\infty} x_\lambda(t), \qquad t \in R_+. \tag{3.6.15}$$

Furthermore, one asks if $t \to x(t)$, $t \in R_+$, coincides (when existing) with
a motion developing under the action of the ideal constraints of Eq. (3.6.12)
and of the active force with potential $V^{(a)}$, in conformity with the Definition
8, p.160, and Proposition 8, p.163.

It is easy to realize that there cannot be a positive answer if the problem 
is posed in the above generality. 

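Just reconsider the point mass in $R^2$, with mass $m > 0$, constrained to
the line $y = 0$ with the new constraint model $(\Sigma, W, \lambda)$ with $\Sigma = \{x\text{-axis}\}$
and, if $\xi_1 = x$, $\xi_2 = y$,

$$W(x, y) = \frac{m}{2}\, y^2 (1 + x^2) \tag{3.6.16}$$

subject, also, to the same active force with potential energy
$V^{(a)} = \frac{m\omega^2}{2}(x^2 + y^2)$. The equations of motion, similar to Eqs. (3.6.3), now become

$$m\ddot x = -m\omega^2 x - \lambda m y^2 x, \qquad m\ddot y = -\lambda m y (1 + x^2) - m\omega^2 y, \tag{3.6.17}$$

with $x(0) = x_0$, $\dot x(0) = v_0$, $y(0) = 0$, $\dot y(0) = w_0$. These equations are more
complex than Eqs. (3.6.3), and will be discussed only in a heuristic, non rigorous
way.

Let $t \to (x_\lambda(t), y_\lambda(t))$, $t \in R_+$, be the solution of Eqs. (3.6.17). Energy
conservation implies that if

$$E = \frac{m}{2}(v_0^2 + w_0^2) + \frac{m\omega^2}{2}\, x_0^2, \tag{3.6.18}$$

one has, $\forall t \ge 0$,

$$E = \frac{m}{2}\big(\dot x_\lambda(t)^2 + \dot y_\lambda(t)^2\big) + \frac{m\omega^2}{2}\big(x_\lambda(t)^2 + y_\lambda(t)^2\big) + \frac{\lambda m}{2}\, y_\lambda(t)^2 \big(1 + x_\lambda(t)^2\big). \tag{3.6.19}$$

Then Eq. (3.6.19) implies

$$|\dot x_\lambda(t)|,\ |\dot y_\lambda(t)| \le \Big(\frac{2E}{m}\Big)^{1/2}, \tag{3.6.20}$$

$$|x_\lambda(t)| \le \Big(\frac{2E}{m\omega^2}\Big)^{1/2}, \tag{3.6.21}$$

$$|y_\lambda(t)| \le \Big(\frac{2E}{m\lambda}\Big)^{1/2}, \tag{3.6.22}$$

which follow by observing that all the addends in Eq. (3.6.19) are nonnegative.
Fix a finite time interval $[0, T]$ and note that the first of Eqs. (3.6.17)
together with Eqs. (3.6.20), (3.6.21), and (3.6.22) implies that the function
$\ddot x_\lambda(t)$ has a uniformly bounded modulus for all $\lambda \ge 1$ and for $t \in [0, T]$.
Hence, if $T$ is “small”, one can think that the function $t \to x_\lambda(t)$, $t \in [0, T]$, is
practically constant together with its first derivative and, then, heuristically
set $x_\lambda(t) = x_0$, $t \in [0, T]$ in the second equation of Eqs. (3.6.17). Within this
“approximation”, Eqs. (3.6.17) become “elementarily soluble”:

$$y_\lambda(t) = \frac{w_0}{\big(\omega^2 + \lambda(1 + x_0^2)\big)^{1/2}}\, \sin\Big( \big(\omega^2 + \lambda(1 + x_0^2)\big)^{1/2}\, t \Big) \tag{3.6.23}$$

which shows that, at least if $t \in [0, T]$, $y_\lambda(t)$ varies very quickly if $\lambda$ is large,
with a frequency $\nu_\lambda \sim (\omega^2 + \lambda(1 + x_0^2))^{1/2}$, remaining, nevertheless, small by
Eq. (3.6.22).

Since, as just seen, $x_\lambda(t)$ varies slowly (essentially $\lambda$ independently), we
can substitute in the first of Eqs. (3.6.17) $\lambda y_\lambda(t)^2$ with its average value
$\overline{\lambda y_\lambda(t)^2}$ between $0$ and $t$, if $t$ is large compared to the characteristic time $T_\lambda$
of variation of $y_\lambda$, namely $T_\lambda \sim 2\pi\, O(\lambda^{-1/2})$:

$$\overline{\lambda y_\lambda(t)^2} = \frac{1}{t} \int_0^t \frac{\lambda w_0^2 \Big( \sin\big( (\omega^2 + \lambda(1 + x_0^2))^{1/2}\, \tau \big) \Big)^2}{\omega^2 + \lambda(1 + x_0^2)}\, d\tau
= \frac{\lambda w_0^2}{\omega^2 + \lambda(1 + x_0^2)} \Big[ \frac{1}{t \sqrt{\omega^2 + \lambda(1 + x_0^2)}} \int_0^{t \sqrt{\omega^2 + \lambda(1 + x_0^2)}} (\sin\theta)^2\, d\theta \Big] \tag{3.6.24}$$

where Eq. (3.6.23) has been used and $\theta$ has been defined as
$\theta = \tau\, (\omega^2 + \lambda(1 + x_0^2))^{1/2} = 2\pi \tau / T_\lambda$. By assumption
$t \gg T_\lambda = 2\pi (\omega^2 + \lambda(1 + x_0^2))^{-1/2}$, therefore the
integral in square brackets can be replaced by

$$\lim_{\Theta \to \infty} \frac{1}{\Theta} \int_0^{\Theta} (\sin\theta)^2\, d\theta = \frac{1}{2}. \tag{3.6.25}$$

Hence,

$$\overline{\lambda y_\lambda(t)^2} \simeq \frac{1}{2}\, \frac{\lambda w_0^2}{\omega^2 + \lambda(1 + x_0^2)} \simeq \frac{1}{2}\, \frac{w_0^2}{1 + x_0^2} \tag{3.6.26}$$

if $\lambda$ is large enough. Then substituting $\lambda y_\lambda(t)^2 \to \overline{\lambda y_\lambda(t)^2}$ in the first of Eqs.
(3.6.17), one finds

$$m\ddot x_\lambda = -m\omega^2 x_\lambda - \frac{m w_0^2}{2}\, \frac{x_\lambda}{1 + x_0^2} \tag{3.6.27}$$

for $t$ near $0$ (but $t \gg T_\lambda$) and $\lambda$ large (note that $T_\lambda \to 0$ as $\lambda \to +\infty$).
For arbitrary values of $t$, a similar argument suggests that, in general, the
acceleration $\ddot x_\lambda$ should verify the equation (when $\lambda \to +\infty$):

$$m\ddot x = -m\omega^2 x - \frac{m w_0^2}{2}\, \frac{1}{\sqrt{1 + x_0^2}}\, \frac{x}{\sqrt{1 + x^2}}. \tag{3.6.28}$$

Hence, the model of constraint to the line $y = 0$ with the structure of Eq.
(3.6.16) does not give rise to the motions that develop under the action of an
ideal constraint to the line $y = 0$ and of an active force with potential energy
$V^{(a)}(x) = \frac{1}{2} m\omega^2 x^2$, when $\lambda \to +\infty$, as one could have naively expected.

Rather one should think that the limit motion, for $\lambda \to +\infty$, of $x_\lambda$ is a
motion subject to an ideal constraint to $y = 0$ and to the active force whose
potential energy is

$$V'^{(a)}(x) = \frac{1}{2} m\omega^2 x^2 + \frac{1}{2} m w_0^2 \sqrt{\frac{1 + x^2}{1 + x_0^2}} \tag{3.6.29}$$

which depends on the initial velocity $w_0$ transversal to the constraint (and, of
course, on the particular form of the structure function $W$).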

It is then possible to think, in general, that in the limit of infinite rigidity,
the model of a conservative bilateral approximate constraint generates
motions which respect the constraints and develop as if they were ideal, but
under the influence of an active force modified with respect to the one with
potential energy $V^{(a)}$ which naively could be thought to be the force “not due
to the constraints”. In general, the structure $W$ of the constraints has some
influence, even for $\lambda$ large, and contributes to the active forces in a way that
may also depend on the initial data or, better, on the “initial stresses on the
constraints”, as in the case of the last example, where the active force depends
also on the initial velocity component $w_0$ orthogonal to the constraint.
This conjecture also sheds some light on the slightly formal distinction in §3.5
between the active force and the reaction of the constraint.
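The heuristic conclusion on the structure $W(x, y) = \frac{m}{2} y^2 (1 + x^2)$ can be probed numerically. The sketch below is an illustration and not a proof: it assumes $m = 1$, $\omega = 1$, the initial datum $x_0 = 1$, $v_0 = 0$, $w_0 = 2$ and the rigidity $\lambda = 10^4$ (all arbitrary choices), integrates the full equations of motion with a fixed-step Runge-Kutta scheme, and compares $x_\lambda(t)$ at $t = 1$ with the motion generated by the modified force, whose $x$-component is $-\omega^2 x - \frac{w_0^2}{2} \frac{x}{\sqrt{1+x_0^2}\sqrt{1+x^2}}$ per unit mass, and with the “naive” motion generated by $-\omega^2 x$ alone.

```python
import math

OMEGA = 1.0  # frequency of the elastic active force (illustrative value)


def rk4(rhs, state, dt, steps):
    """Integrate state' = rhs(state) with the classical Runge-Kutta scheme."""
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
        k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
        k4 = rhs([s + dt * k for s, k in zip(state, k3)])
        state = [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state


def full_rhs(lam):
    """Equations of motion with the approximate constraint of rigidity lam, m = 1."""
    def rhs(s):
        x, y, vx, vy = s
        return [vx, vy,
                -OMEGA ** 2 * x - lam * y * y * x,
                -lam * y * (1.0 + x * x) - OMEGA ** 2 * y]
    return rhs


def effective_rhs(x0, w0):
    """Motion on y = 0 under the modified active force discussed in the text."""
    def rhs(s):
        x, vx = s
        extra = 0.5 * w0 ** 2 * x / (math.sqrt(1 + x0 * x0) * math.sqrt(1 + x * x))
        return [vx, -OMEGA ** 2 * x - extra]
    return rhs


def naive_rhs(s):
    """Motion on y = 0 under the unmodified elastic force only."""
    x, vx = s
    return [vx, -OMEGA ** 2 * x]


x0, v0, w0, lam = 1.0, 0.0, 2.0, 1.0e4    # illustrative initial datum and rigidity
x_full = rk4(full_rhs(lam), [x0, 0.0, v0, w0], 1e-4, 10000)[0]   # t = 1
x_eff = rk4(effective_rhs(x0, w0), [x0, v0], 1e-3, 1000)[0]      # t = 1
x_naive = rk4(naive_rhs, [x0, v0], 1e-3, 1000)[0]                # t = 1

print("full:", x_full, " modified force:", x_eff, " naive:", x_naive)
```

With these choices the full motion should follow the modified-force motion much more closely than the naive one; repeating the run with `w0 = 0.0` makes the extra term vanish, so the two reduced models then coincide.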

In the following section we will deal with questions related to the following 
problems. 


(1) Which further condition is it necessary to place on an approximate constraint
model $(\Sigma, W, \lambda)$ to imply that the motion $t \to x_\lambda(t)$, $t \ge 0$, developing
under the action of the force with potential energy

$$V^{(a)} + \lambda W, \tag{3.6.30}$$

and following the initial datum

$$x_\lambda(0) = \xi_0 \in \Sigma, \qquad \dot x_\lambda(0) = \eta_0 \tag{3.6.31}$$

is well approximated by the motion that takes place under the action of the active
force with potential energy $V^{(a)}$ and of the ideal constraints $\varphi^{(1)}, \ldots, \varphi^{(s)}$
and follows the initial datum $x(0) = \xi_0$, $\dot x(0) = \eta_0'$, where $\eta_0'$ is a suitable
“projection of $\eta_0$ on $\Sigma$”, assuming that $\Sigma$ is determined by the equations
$\varphi^{(1)}(\xi) = \ldots = \varphi^{(s)}(\xi) = 0$? (See Definitions 7 and 8, §3.5.)

In other words the question is: when does an approximate conservative constraint
appear as well approximated by an ideal constraint model in the sense
of Definition 8, §3.5, with the “naive” identification of the active forces?


(2) If $(\Sigma, W, \lambda)$ is a model of an approximate conservative constraint, is it
true that the motion developing under the action of a force with the potential
energy of Eq. (3.6.30) and following the initial datum of Eq. (3.6.31) is well
approximated, as $\lambda \to +\infty$, by a motion developing under the influence of the
ideal constraints $\varphi^{(1)}, \ldots, \varphi^{(s)}$ (determining $\Sigma$) and of a conservative active
force with potential energy $V'^{(a)}$, possibly different from $V^{(a)}$ and
$(\eta_0, \xi_0)$-dependent?

(3) In the same situation as that of question (2) and if $\eta_0$ is suitable, i.e.
$\eta_0 = \eta_0'$, where $\eta_0'$ is as in question (1), is it true that $V'^{(a)} = V^{(a)}$? [This
seems to be true in the example, heuristically explicitly studied above, about
the constraint to the line $y = 0$ generated by Eq. (3.6.16), when $w_0 = 0$; see
Eqs. (3.6.17) and (3.6.29)].

Actually, we shall really study in detail question (1) only, which we shall 
refer to as the problem of the determination of “sufficient perfection conditions 
for approximate bilateral conservative constraints”. 

It will be useful and necessary to analyze in some deeper way the kine- 
matics of the system of point masses subject to constraints: this is a purely 
geometric analysis, very suggestive for its relationship with differential geom- 
etry. The following section is mainly devoted to this task. 


3.6.1 Exercises and Problems 


1. Show that the polar coordinates are a regular system of coordinates in various regions
$U \subset R^2$ or $U \subset R^3$.


2. Show that the surface of a sphere is a regular surface in the sense of Definition 10, p.170. 


3. Show that the surface of the paraboloid $z = (x^2 + y^2)/2$ is a regular two-dimensional
surface in the sense of Definition 10, p.170. Treat similarly the hyperboloid and ellipsoid
cases.


4. Consider the ellipsoid surface $\frac{x^2}{a} + \frac{y^2}{b} + \frac{z^2}{c} = 1$, $0 < a < b < c$, and show that if
$\beta = (\beta_1, \beta_2, \beta_3) = (w, u, v)$

$$x^2 = (1 + w)\, a\, \frac{(u - a)(v - a)}{(b - a)(c - a)}, \qquad
y^2 = (1 + w)\, b\, \frac{(u - b)(v - b)}{(a - b)(c - b)}, \qquad
z^2 = (1 + w)\, c\, \frac{(u - c)(v - c)}{(a - c)(b - c)}$$

is a local system of regular coordinates in the vicinity of various points of the ellipsoid’s
surface. This system is adapted to the surface itself, which has equations $\beta_1 = w = 0$
(“Jacobi’s coordinates”). The domains $w = 0$, $u \in (a, b)$, $v \in (b, c)$ and $w = 0$, $u \in (b, c)$,
$v \in (a, b)$ give the part of the ellipsoid situated in the first octant in $R^3$ with the exception of
a few lines; determine the latter lines.


5. Let $V \in C^\infty(R)$, $V'(\xi) = \frac{dV}{d\xi} \ne 0$, $\forall \xi \ne 0$, $\lim_{\xi \to \pm\infty} V(\xi) = +\infty$, $V(0) = 0$. Given a
point $(\eta, \xi) \in R^2$, $(\eta, \xi) \ne (0, 0)$, define

$$E = \frac{\eta^2}{2} + V(\xi), \qquad T(E) = 2 \int_{x_-(E)}^{x_+(E)} \frac{d\xi'}{\sqrt{2(E - V(\xi'))}},$$

where $x_-(E) < x_+(E)$ are the two roots of $E - V(\xi) = 0$. Define

$$\varphi(\eta, \xi) = \frac{2\pi}{T(E)} \cdot \Big\{ \text{time necessary to the motion } \ddot x = -\frac{dV}{d\xi} \text{ with initial datum } (0, x_-(E)) \text{ to reach } (\eta, \xi) \Big\}$$

which defines $\varphi$ (mod $2\pi$) (as every motion with initial velocity $\eta$ and position $\xi$ is periodic
and, sooner or later, visits $x_-(E)$ with velocity $0$).

Show that the coordinates $(E, \varphi)$ are a regular system of coordinates near any point in $R^2$
other than the origin, and that they generalize the polar coordinates $(\rho, \theta)$ with $E$ being
analogous to $\rho^2/2$ and $\varphi$ to $\theta$. (Hint: First consider the case $V(\xi) = \xi^2/2$ and explicitly find
$\varphi(\eta, \xi)$. Draw the qualitative form of the curves $\eta^2/2 + V(\xi) = E$ as $E$ varies.)


3.7 Kinematics of Quasi-constrained Systems. 
Reformulation of Perfection Criteria for Approximate 
Conservative Constraints 


In this section $\mathbb{R}^{Nd}$ is regarded as a vector space in which the scalar product
between two vectors $\eta = (\eta^{(1)}, \ldots, \eta^{(N)})$ and $\chi = (\chi^{(1)}, \ldots, \chi^{(N)})$ is defined
by

\[
\eta \cdot \chi = \sum_{i=1}^{N} m_i\, \eta^{(i)} \cdot \chi^{(i)}, \tag{3.7.1}
\]

where $m_1, \ldots, m_N$ are given positive numbers. The length of a vector is then

\[
|\eta| = \Big( \sum_{i=1}^{N} m_i\, (\eta^{(i)})^2 \Big)^{1/2}. \tag{3.7.2}
\]

The strange convention above allows one to say that the kinetic energy of a
motion $t \to x(t)$, $t \ge t_0$, of a system of $N$ points, with masses $m_1, \ldots, m_N$, is

\[
T(t) = \frac{1}{2}\, |\dot x(t)|^2, \tag{3.7.3}
\]

i.e., it is one-half the square of the velocity of the point representing the system
configuration in $\mathbb{R}^{Nd}$ without explicit reference to the masses (which of course
are now hidden in the definition of length given by Eq. (3.7.2)).

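Let $(U, \Xi)$ be a local system of regular coordinates in $\mathbb{R}^{Nd}$ with basis
$\Omega \subset \mathbb{R}^{Nd}$ (see Definition 10, p.170) and let $t \to x(t)$, $t \ge t_0$, be a motion of
a system of $N$ point masses, with masses $m_1, \ldots, m_N > 0$, taking place for
$t \in [t_1, t_2]$ inside $U$ (i.e., $x(t) \in U$, $\forall t \in [t_1, t_2]$).

We can then consider the motion $t \to b(t)$, $t \in [t_1, t_2]$, "image" in the
basis $\Omega$ of the motion $t \to x(t)$, $t \in [t_1, t_2]$, via the coordinate system $(U, \Xi)$,
i.e., the motion such that

\[
x(t) = \Xi(b(t)), \qquad t \in [t_1, t_2]. \tag{3.7.4}
\]

It is obviously possible to express the kinetic energy of the motion $x$ in
terms of the kinematic properties of its image motion $b$. In fact, if
$\Xi = (\Xi^{(1)}, \ldots, \Xi^{(N)})$, differentiating Eq. (3.7.4) with respect to $t$ gives

\[
\dot x^{(j)}(t) = \sum_{\ell=1}^{Nd} \frac{\partial \Xi^{(j)}}{\partial \beta_\ell}(b(t))\, \dot b_\ell(t), \qquad j = 1, \ldots, N, \tag{3.7.5}
\]

where $x(t) = (x^{(1)}(t), \ldots, x^{(N)}(t))$, $b(t) = (b_1(t), \ldots, b_{Nd}(t))$, with the usual
notations. Hence

\[
T(t) = \sum_{j=1}^{N} \frac{m_j}{2}\, (\dot x^{(j)}(t))^2
= \frac{1}{2} \sum_{\ell,\ell'=1}^{Nd} \Big( \sum_{j=1}^{N} m_j\,
\frac{\partial \Xi^{(j)}}{\partial \beta_\ell}(b(t)) \cdot \frac{\partial \Xi^{(j)}}{\partial \beta_{\ell'}}(b(t)) \Big)\,
\dot b_\ell(t)\, \dot b_{\ell'}(t), \tag{3.7.6}
\]

which will be written as

\[
T(t) = \frac{1}{2} \sum_{\ell,\ell'=1}^{Nd} g_{\ell\ell'}(b(t))\, \dot b_\ell(t)\, \dot b_{\ell'}(t), \tag{3.7.7}
\]

having set, $\forall \beta \in \Omega$, $\forall \ell, \ell' = 1, \ldots, Nd$:

\[
g_{\ell\ell'}(\beta) = \sum_{j=1}^{N} m_j\,
\frac{\partial \Xi^{(j)}}{\partial \beta_\ell}(\beta) \cdot \frac{\partial \Xi^{(j)}}{\partial \beta_{\ell'}}(\beta)
= g_{\ell'\ell}(\beta). \tag{3.7.8}
\]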

It is convenient to establish a general definition in connection with Eqs. (3.7.7) 
and (3.7.8) because of the generality of Eq. (3.7.7) itself. 


11 Definition. The function $\beta \to g(\beta)$ defined by Eq. (3.7.8) on $\Omega$ and
with values in the $Nd \times Nd$ matrices will be called the "kinetic matrix" for
the scalar product of Eq. (3.7.1) or, equivalently, for a system of $N$ points
in $\mathbb{R}^d$, with masses $m_1, \ldots, m_N > 0$, "relative to the local system of regular
coordinates $(U, \Xi)$ in $\mathbb{R}^{Nd}$".


Observations. (1) Via Eq. (3.7.7), the kinetic matrix allows one to compute 
the kinetic energy in arbitrary local coordinates; hence, its name. 

(2) Some of the properties of the kinetic matrix will be listed and discussed at
the end of the section. For the moment, note that $g(\beta)$ is a symmetric matrix
whose elements, thought of as functions on $\Omega$, are in $C^\infty(\Omega)$.
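As an illustration of Eq. (3.7.8), the following minimal sketch approximates the kinetic matrix numerically by central differences. The function names (`partials`, `kinetic_matrix`, `polar`), the step `h`, and the choice of one point of mass 3 in the plane described by polar coordinates are assumptions made only for this example; they are not part of the text.

```python
import math

def partials(xi, beta, h=1e-6):
    """Central differences: partials[l][k] approximates d xi_k / d beta_l."""
    out = []
    for l in range(len(beta)):
        up, dn = list(beta), list(beta)
        up[l] += h
        dn[l] -= h
        out.append([(a - b) / (2.0 * h) for a, b in zip(xi(up), xi(dn))])
    return out

def kinetic_matrix(xi, beta, masses, d):
    """g[l][l2] = sum_j m_j dXi^(j)/dbeta_l . dXi^(j)/dbeta_l2, as in Eq. (3.7.8)."""
    cols = partials(xi, beta)
    # Cartesian component k belongs to point k // d, hence carries mass masses[k // d].
    return [[sum(masses[k // d] * ca[k] * cb[k] for k in range(len(ca)))
             for cb in cols] for ca in cols]

def polar(beta):
    """Hypothetical coordinate map Xi for one point in the plane: (rho, theta) -> (x, y)."""
    rho, theta = beta
    return [rho * math.cos(theta), rho * math.sin(theta)]

# One point of mass 3 at rho = 2, theta = 0.7.
g = kinetic_matrix(polar, [2.0, 0.7], masses=[3.0], d=2)
```

For polar coordinates the matrix should be close to diag(m, m rho^2), so that Eq. (3.7.7) reproduces the familiar expression of the kinetic energy, (m/2)(rho'^2 + rho^2 theta'^2).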


For the study of the kinematics of quasi-constrained systems, the following 
purely geometrical definition is useful. 


12 Definition. Let $\Sigma$ be a regular surface in $\mathbb{R}^{Nd}$ with codimension $s$ and
let $U$ be a neighborhood of a point $\xi_0 \in \Sigma$ on which a system $(U, \Xi)$ of local
regular coordinates, with basis $\Omega$, is defined.
Assume that the coordinate system is "adapted" to $\Sigma$ (see Definition 10,
p.170), i.e., that $\Sigma \cap U$ is described in $(U, \Xi)$ by

\[
\beta_1 = \ldots = \beta_s = 0. \tag{3.7.9}
\]

(a) We shall say that $(U, \Xi)$ is "well adapted" to $\Sigma$ if the kinetic matrix, for
the scalar product of Eq. (3.7.1), associated with $(U, \Xi)$ has the first $s \times s$
principal submatrix which is constant on the plane of Eq. (3.7.9); i.e., if for
$\beta = (0, \ldots, 0, \beta_{s+1}, \ldots, \beta_{Nd}) \in \Omega$ it is, $\forall \ell, \ell' = 1, \ldots, s$:

\[
g_{\ell\ell'}(\beta) = g_{\ell\ell'}(0, \ldots, 0, \beta_{s+1}, \ldots, \beta_{Nd}) = \gamma_{\ell\ell'}, \tag{3.7.10}
\]

where $\gamma$ is an $s \times s$ $\beta$-independent matrix.
(b) We shall say that $(U, \Xi)$ is "orthogonal" on $\Sigma$ with respect to the scalar
product of Eq. (3.7.1) if $\forall \beta = (0, \ldots, 0, \beta_{s+1}, \ldots, \beta_{Nd})$:

\[
g_{\ell k}(\beta) = 0, \qquad \ell = 1, 2, \ldots, s; \quad k = s+1, \ldots, Nd. \tag{3.7.11}
\]


Observations. (1) Let $t \to x(t)$, $t \in \mathbb{R}_+$, be a motion of $N$ points in $\mathbb{R}^d$,
with masses $m_1, \ldots, m_N > 0$, which at some time $\bar t$ happens to be in $\Sigma$ with
velocity $\dot x(\bar t)$ "purely transversal" to $\Sigma$ in the coordinate system $(U, \Xi)$, i.e.,
such that the motion $b$, image of $x$ in $\Omega$ for $t$ close to $\bar t$, has velocity $\dot b(\bar t)$ with
components vanishing "along $\Sigma$":

\[
\dot b(\bar t) = (\dot b_1(\bar t), \ldots, \dot b_s(\bar t), 0, \ldots, 0). \tag{3.7.12}
\]

If the coordinate system is well adapted, then the kinetic energy of $x$ at time
$\bar t$ depends only upon $\dot b(\bar t)$ but not on the particular position $b(\bar t)$.
(2) If the coordinate system $(U, \Xi)$ is orthogonal on $\Sigma$, the kinetic energy
$T(t)$ of a motion $t \to x(t)$ which for $t = \bar t$ crosses $\Sigma$ is in this instant a sum of
two terms: one depending only on $\dot b_1(\bar t), \ldots, \dot b_s(\bar t)$ and on $b(\bar t)$, and the other
depending only on $\dot b_{s+1}(\bar t), \ldots, \dot b_{Nd}(\bar t)$ and on $b(\bar t)$:

\[
T_1(\bar t) = \frac{1}{2} \sum_{\ell,\ell'=1}^{s} g_{\ell\ell'}(b(\bar t))\, \dot b_\ell(\bar t)\, \dot b_{\ell'}(\bar t), \qquad
T_2(\bar t) = \frac{1}{2} \sum_{\ell,\ell'=s+1}^{Nd} g_{\ell\ell'}(b(\bar t))\, \dot b_\ell(\bar t)\, \dot b_{\ell'}(\bar t), \tag{3.7.13}
\]

and $T(\bar t) = T_1(\bar t) + T_2(\bar t)$. In other words, one can say that, in such a system
of coordinates, for $t = \bar t$ the kinetic energy is the sum of the kinetic energies
of the component of motion orthogonal to $\Sigma$ and of the component parallel
to it.
(3) Thinking of this, it should become geometrically evident that if $\Sigma$ is a
regular surface in $\mathbb{R}^{Nd}$ and $\xi_0 \in \Sigma$, it will always be possible to construct a
system of local coordinates in a neighborhood $U$ of $\xi_0$ which is well adapted
and orthogonal to $\Sigma$ (see Proposition 12 to follow).

The equations of motion can immediately be written in an arbitrary coordinate
system, either in the absence or in the presence of constraints, using
the following propositions.


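9 Proposition. Let $V \in C^\infty(\mathbb{R}^{Nd})$ be a real-valued function bounded from below.
Let $t \to x(t)$, $t \ge t_0$, be a motion of $N$ points, with masses $m_1, \ldots, m_N > 0$,
in $\mathbb{R}^d$ developing under the influence of the force $F$ with potential energy
$V$. Suppose that for $t \in [t_1, t_2]$, the motion $x$ takes place in a neighborhood
$U \subset \mathbb{R}^{Nd}$ where a local system of regular coordinates $(U, \Xi)$ is established
with basis $\Omega \subset \mathbb{R}^{Nd}$. Call $b$ the motion in $\Omega$, image of the considered motion
$x$, for $t \in [t_1, t_2]$, via the coordinate transformation $\Xi$.

Then $b$ verifies Lagrange's equations⁷ associated with the Lagrangian:

\[
(\alpha, \beta) \to L(\alpha, \beta) = \frac{1}{2} \sum_{\ell,\ell'=1}^{Nd} g_{\ell\ell'}(\beta)\, \alpha_\ell\, \alpha_{\ell'} - V(\Xi(\beta)), \tag{3.7.14}
\]

where $\alpha \in \mathbb{R}^{Nd}$, $\beta \in \Omega$ and $g$ is the kinetic matrix for the scalar product of
Eq. (3.7.1) relative to $(U, \Xi)$. Explicitly, such equations are, $\forall \ell = 1, \ldots, Nd$:

\[
\frac{d}{dt} \Big( \sum_{\ell'=1}^{Nd} g_{\ell\ell'}(b(t))\, \dot b_{\ell'}(t) \Big)
= - \frac{\partial V(\Xi(\beta))}{\partial \beta_\ell} \Big|_{\beta = b(t)}
+ \frac{1}{2} \sum_{\ell',\ell''=1}^{Nd} \frac{\partial g_{\ell'\ell''}}{\partial \beta_\ell}(b(t))\, \dot b_{\ell'}(t)\, \dot b_{\ell''}(t). \tag{3.7.15}
\]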

PROOF. This is an exercise based on the definition of Lagrangian equations 
and on the least-action principle as in Proposition 4, §3.3. It will be left to 
the reader. 
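A simple consistency check of Eq. (3.7.15) can be sketched numerically, assuming one point of unit mass in the plane, no force (V = 0) and polar coordinates, for which g = diag(1, rho^2). Equations (3.7.15) then read rho'' = rho theta'^2 and d/dt(rho^2 theta') = 0, and their solution, mapped back to Cartesian coordinates, should be uniform rectilinear motion. The integrator `rk4_step`, the helper names and the initial data are illustrative assumptions.

```python
import math

def rk4_step(f, y, dt):
    """One classical Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f([a + 0.5 * dt * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * dt * b for a, b in zip(y, k2)])
    k4 = f([a + dt * b for a, b in zip(y, k3)])
    return [a + dt * (p + 2 * q + 2 * r + s) / 6.0
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def polar_free(y):
    """Eq. (3.7.15) for g = diag(1, rho^2) and V = 0, written as a first-order system."""
    rho, theta, drho, dtheta = y
    return [drho, dtheta, rho * dtheta ** 2, -2.0 * drho * dtheta / rho]

def evolve_polar(rho0, theta0, drho0, dtheta0, t_end, dt=1e-3):
    """Integrate the image motion b(t) and return the Cartesian point Xi(b(t_end))."""
    y = [rho0, theta0, drho0, dtheta0]
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(polar_free, y, dt)
    return y[0] * math.cos(y[1]), y[0] * math.sin(y[1])

# Start at (x, y) = (1, 0) with Cartesian velocity (0.3, 1.0).
x_end, y_end = evolve_polar(1.0, 0.0, 0.3, 1.0, t_end=1.0)
```

With these data the Cartesian motion is x(t) = (1 + 0.3 t, t), so the returned point should be close to (1.3, 1.0).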


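⁷ If $(\alpha, \beta) \to L(\alpha, \beta)$ is a real $C^\infty$ function defined on an open set $W \subset \mathbb{R}^{2M}$, we shall
say that a $C^\infty$ function $t \to b(t)$, $t \in [t_1, t_2]$, such that $(\dot b(t), b(t)) \in W$, $\forall t \in [t_1, t_2]$,
verifies Lagrange's equations associated with $L$ if

\[
\frac{d}{dt} \Big( \frac{\partial L}{\partial \alpha_\ell}(\dot b(t), b(t)) \Big) = \frac{\partial L}{\partial \beta_\ell}(\dot b(t), b(t)), \qquad \ell = 1, 2, \ldots, M,
\]

even when $W$ does not coincide with $\mathbb{R}^{2M}$, as usually supposed so far in connection with
the Lagrange equations.


10 Proposition. Consider $N$ points in $\mathbb{R}^d$, with masses $m_1, \ldots, m_N > 0$,
and let $F^{(a)}$ be a conservative force with potential energy $V^{(a)}$, bounded below.
Let $\Sigma \subset \mathbb{R}^{Nd}$ be a codimension-$s$ regular surface and suppose that $\Sigma$ is the
set of points $\xi \in \mathbb{R}^{Nd}$ such that

\[
\varphi^{(1)}(\xi) = \ldots = \varphi^{(s)}(\xi) = 0, \tag{3.7.16}
\]

where $\varphi^{(1)}, \ldots, \varphi^{(s)} \in C^\infty(\mathbb{R}^{Nd})$.
Let $t \to x(t) \in \Sigma$, $t \in [t_1, t_2]$, be a motion developing in a neighborhood $U$
of $\xi_0 \in \Sigma$ under the influence of the force $F^{(a)}$ and of the holonomous ideal
constraint $\varphi^{(1)}, \ldots, \varphi^{(s)}$ in the sense of Definition 8, §3.5, p.160. Suppose that
$(U, \Xi)$ is a local system of regular coordinates adapted to $\Sigma$ and with basis $\Omega$
(see Definition 10, §3.6) and let $b$ be the image motion of $x$ on $\Omega$:

\[
b(t) = \Xi^{-1}(x(t)), \qquad t \in [t_1, t_2]. \tag{3.7.17}
\]

Since $b$ respects the constraints it is such that

\[
b_1(t) = \ldots = b_s(t) = 0, \qquad t \in [t_1, t_2]. \tag{3.7.18}
\]

Then, setting $b(t) = (0, \ldots, 0, b_{s+1}(t), \ldots, b_{Nd}(t)) = (0, b^{(s)}(t))$, the motion
$t \to b^{(s)}(t)$ verifies the Lagrange equations⁸ associated with a Lagrangian $L_0$
on $\mathbb{R}^{Nd-s} \times \Omega^{(s)}$, where $\Omega^{(s)}$ denotes the set of points
$\beta^{(s)} = (\beta_{s+1}, \ldots, \beta_{Nd}) \in \mathbb{R}^{Nd-s}$ such that $(0, \beta^{(s)}) \in \Omega$ ($0$ being the origin in $\mathbb{R}^s$).
The Lagrangian $L_0$ is defined by

\[
L_0(\alpha^{(s)}, \beta^{(s)}) = \frac{1}{2} \sum_{\ell,\ell'=s+1}^{Nd} g_{\ell\ell'}(0, \beta^{(s)})\, \alpha_\ell\, \alpha_{\ell'} - V^{(a)}(\Xi(0, \beta^{(s)})), \tag{3.7.19}
\]

$\forall (\alpha^{(s)}, \beta^{(s)}) \in \mathbb{R}^{Nd-s} \times \Omega^{(s)}$. Hence, the equations of
motion for $b^{(s)}$ are, $\forall \ell = s+1, \ldots, Nd$:

\[
\frac{d}{dt} \Big( \sum_{\ell'=s+1}^{Nd} g_{\ell\ell'}(0, b^{(s)}(t))\, \dot b_{\ell'}(t) \Big)
= - \frac{\partial V^{(a)}(\Xi(0, \beta^{(s)}))}{\partial \beta_\ell} \Big|_{\beta^{(s)} = b^{(s)}(t)}
+ \frac{1}{2} \sum_{\ell',\ell''=s+1}^{Nd} \frac{\partial g_{\ell'\ell''}}{\partial \beta_\ell}(0, b^{(s)}(t))\, \dot b_{\ell'}(t)\, \dot b_{\ell''}(t). \tag{3.7.20}
\]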

PROOF. Proposition 10 can be proved as a corollary to Proposition 8, §3.5,
p.160, and it will be left to the reader as an important exercise.

⁸ See footnote 7.


Through Propositions 9 and 10 and the above definitions, it is now
possible to reformulate and make precise Problem (1) posed at the end of
§3.6, p.174. It appears to be naturally related to the following definition.


13 Definition. Let $(\Sigma, W, \lambda)$ be a model for a bilateral conservative constraint
for a system of $N$ points in $\mathbb{R}^d$, with masses $m_1, \ldots, m_N$. Suppose that $\Sigma$ has
codimension $s$ and that it is described by Eq. (3.7.16).

Let $V^{(a)}$ be a real-valued $C^\infty(\mathbb{R}^{Nd})$ function bounded below and let
$t \to x_\lambda(t)$, $t \in \mathbb{R}_+$, be the motion of the system developing under the influence
of the force with potential energy

\[
\xi \to V(\xi) = V^{(a)}(\xi) + \lambda W(\xi) \tag{3.7.21}
\]

following the initial datum

\[
x_\lambda(0) = \xi_0 \in \Sigma, \qquad \dot x_\lambda(0) = \eta_0 \in \mathbb{R}^{Nd}. \tag{3.7.22}
\]

We shall say that $(\Sigma, W, \lambda)$ is a model of an "ideal approximate constraint"
if, $\forall V^{(a)}, \xi_0, \eta_0$ as above:
(i) The following limit exists:

\[
\lim_{\lambda \to +\infty} x_\lambda(t) = x(t), \qquad \forall t \ge 0. \tag{3.7.23}
\]

(ii) The function $t \to x(t)$ is a motion developing on $\Sigma$ under the influence
of the active force $F^{(a)}$, with potential energy $V^{(a)}$, and of the ideal constraint
$\varphi^{(1)}, \ldots, \varphi^{(s)}$ to $\Sigma$, in the sense of Definition 8, §3.5.
(iii) The initial datum verified by $x$ is

\[
x(0) = \xi_0, \qquad \dot x(0) = \eta_0^{\parallel}, \tag{3.7.24}
\]

where $\eta_0^{\parallel}$ is the orthogonal projection of $\eta_0$ on the tangent plane to $\Sigma$ in $\xi_0$,
with respect to the scalar product of Eq. (3.7.1) [see Observation (1) below].


Observations. (1) Let $(U, \Xi)$ be a local system of regular coordinates with
basis $\Omega$. Let $\xi_0 \in U$ and suppose that $(U, \Xi)$ is adapted and orthogonal on
$\Sigma$: it will be, by Eq. (3.7.5),

\[
\eta_0 = \sum_{\ell=1}^{Nd} \frac{\partial \Xi}{\partial \beta_\ell}(\beta_0)\, \alpha^0_\ell, \tag{3.7.25}
\]

where $\beta_0$ are the coordinates of $\xi_0$ in $(U, \Xi)$ and $\alpha^0 \in \mathbb{R}^{Nd}$ is a suitable
vector. Then the projection $\eta_0^{\parallel}$ is, by definition,

\[
\eta_0^{\parallel} = \sum_{\ell=s+1}^{Nd} \frac{\partial \Xi}{\partial \beta_\ell}(\beta_0)\, \alpha^0_\ell. \tag{3.7.26}
\]

It could be checked that $\eta_0^{\parallel}$ does not depend on the coordinate system $(U, \Xi)$
provided the latter is adapted and orthogonal on $\Sigma$.

(2) By the above definition, if $(\Sigma, W, \lambda)$ is a model of an approximate ideal
constraint, the motions of the system starting on $\Sigma$ and developing under
the influence of a conservative force with potential energy $V^{(a)} + \lambda W$ are, for
large $\lambda$, well approximated by the motions of the same system subject to $s$
ideal constraints $\varphi^{(1)}, \ldots, \varphi^{(s)}$, determining $\Sigma$, and to an active force with
potential energy $V^{(a)}$, in the sense of Definition 8, §3.5, p.160. Of course, it
would be desirable, and necessary in order to make quantitative statements,
to have estimates of the difference between $x(t)$ and $x_\lambda(t)$ in terms of $\lambda$: this
will be done, in some cases, in §3.8.

(3) Problem 1, p.174, §3.6, is equivalent to the following question: given
$(\Sigma, W, \lambda)$, how does one recognize whether this is a model of an approximate
ideal constraint?

(4) Assume that for $t \in [t_1, t_2]$, the motion $t \to x_\lambda(t)$ takes place for large
enough $\lambda$ in a neighborhood $U$ such that $U \cap \Sigma \ne \emptyset$ and suppose that a
local system of regular coordinates is established on $U$: $(U, \Xi)$ with basis $\Omega$
adapted to $\Sigma$. Let $t \to b_\lambda(t)$, $t \in [t_1, t_2]$, be the image in $\Omega$ of $x_\lambda$ considered
for $t \in [t_1, t_2]$. Items (i) and (ii) of Definition 13 are equivalent to:

(i$_1$) There is a limit

\[
\lim_{\lambda \to +\infty} b_\lambda(t) = b(t) = (0, b^{(s)}(t)), \qquad t \in [t_1, t_2], \tag{3.7.27}
\]

where $0$ is the origin in $\mathbb{R}^s$ and $b^{(s)}(t)$ is a suitable $\mathbb{R}^{Nd-s}$-valued function.
(ii$_1$) $t \to b(t)$ is a $C^\infty([t_1, t_2])$ function verifying Eqs. (3.7.18) and (3.7.20),
$\forall t \in [t_1, t_2]$.

Condition (iii) is equivalent to:
(iii$_1$) If $\xi_0 \in U$ and $(U, \Xi)$ is orthogonal on $\Sigma$:

\[
\Xi(b(0)) = \xi_0, \qquad \dot b_i(0) = 0, \quad i = 1, 2, \ldots, s. \tag{3.7.28}
\]

In the next section we shall discuss an important sufficient perfection criterion
for approximate conservative constraints and the perfection will be checked in the
form of (i$_1$), (ii$_1$), and (iii$_1$) above.
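The orthogonal projection of Observation (1), taken with respect to the scalar product of Eq. (3.7.1), can be sketched in the simplest situation of a single constraint (s = 1, an assumption of this example). Tangent vectors are those annihilated by the gradient of the constraint function, so the direction orthogonal to the surface in the mass-weighted sense has components grad_k / w_k, where w_k is the mass attached to the k-th Cartesian component. The function names and the sample data (two points on a line joined by a rigid link) are hypothetical.

```python
def weighted_dot(w, u, v):
    """Scalar product of Eq. (3.7.1), with w[k] the mass attached to component k."""
    return sum(wk * uk * vk for wk, uk, vk in zip(w, u, v))

def tangential_part(w, grad_phi, eta):
    """Project eta on the tangent plane of {phi = 0}, orthogonally in the sense of Eq. (3.7.1)."""
    # nu spans the mass-weighted normal direction: weighted_dot(w, nu, tau) = grad_phi . tau = 0
    # for every tangent vector tau.
    nu = [gk / wk for gk, wk in zip(grad_phi, w)]
    c = weighted_dot(w, eta, nu) / weighted_dot(w, nu, nu)
    return [ek - c * nk for ek, nk in zip(eta, nu)]

# Two points on a line (N = 2, d = 1), masses 1 and 3, constraint (xi1 - xi2)^2 - 1 = 0,
# configuration (xi1, xi2) = (1, 0), so that grad phi = (2, -2); velocities (2, -2).
w = [1.0, 3.0]
eta_par = tangential_part(w, [2.0, -2.0], [2.0, -2.0])
```

In this example the tangent directions are the rigid translations (1, 1), and the projection assigns to both points the baricenter velocity (m1 v1 + m2 v2)/(m1 + m2) = -1.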


We conclude this section by stating some simple properties of the kinetic
matrices on $\mathbb{R}^{Nd}$ associated with the scalar product of Eq. (3.7.1) in a local
system of regular coordinates. We shall also sketch the proof of some metric
properties of the regular surfaces (i.e., the existence of well-adapted and
orthogonal coordinates).


11 Proposition. The matrix $g(\beta)$ defined in Eq. (3.7.8) on $\Omega$ is, $\forall \beta \in \Omega$,
symmetric and positive definite. The matrix elements of $g(\beta)$, as well as
the matrix elements of the matrices inverting $g(\beta)$ or any of its principal
submatrices, are in $C^\infty(\Omega)$ as functions of $\beta \in \Omega$.

There exists a positive continuous function $\beta \to C(\beta)$, defined on $\Omega$, such
that if $\mu(\beta)$ is a $q \times q$ principal submatrix of $g(\beta)$ or an inverse to such a
matrix, then $\forall \sigma = (\sigma_1, \ldots, \sigma_q) \in \mathbb{R}^q$:

\[
C(\beta) \sum_{i=1}^{q} \sigma_i^2 \le \sum_{\ell,\ell'=1}^{q} \mu_{\ell\ell'}(\beta)\, \sigma_\ell\, \sigma_{\ell'} \le C(\beta)^{-1} \sum_{i=1}^{q} \sigma_i^2. \tag{3.7.29}
\]


Observation. We recall that a $q \times q$ matrix $\mu$ is called positive definite if it is
symmetric and

\[
\sum_{\ell,\ell'=1}^{q} \mu_{\ell\ell'}\, \sigma_\ell\, \sigma_{\ell'} > 0, \qquad \forall \sigma \in \mathbb{R}^q, \ \sigma \ne 0 \tag{3.7.30}
\]

(see Appendix F).


PROOF. The symmetry is obvious and has already been remarked on [see Eq.
(3.7.8)]. The positivity of $g(\beta)$ follows from its kinematic interpretation: in
fact, $\frac{1}{2} \sum_{\ell,\ell'=1}^{Nd} g_{\ell\ell'}(\beta)\, \sigma_{\ell'} \sigma_\ell$ is the kinetic energy of a motion which at some
time happens to be in $\Xi(\beta)$ with velocity
$\dot x^{(j)} = \sum_{\ell=1}^{Nd} \frac{\partial \Xi^{(j)}}{\partial \beta_\ell}(\beta)\, \sigma_\ell$, $j = 1, \ldots, N$
[see Eq. (3.7.5)].

If $\sigma \ne 0$, then $\dot x \ne 0$ because the coordinate system is regular [see Eq.
(3.6.10)]. Hence, $\sum_{\ell,\ell'=1}^{Nd} g_{\ell\ell'}(\beta)\, \sigma_{\ell'} \sigma_\ell = \sum_{j=1}^{N} m_j (\dot x^{(j)})^2 > 0$.

From algebra, it is known that a matrix $g(\beta)$ is positive definite if and
only if all its principal submatrices are positive definite: in such case also
their inverse matrices are all positive definite and all the mentioned matrices
have a positive determinant. Furthermore, if $\mu$ is a $q \times q$ positive definite
matrix, there is a positive continuous function $C(\mu)$ of its matrix elements such
that

\[
C(\mu) \sum_{i=1}^{q} \sigma_i^2 \le \sum_{\ell,\ell'=1}^{q} \mu_{\ell\ell'}\, \sigma_\ell\, \sigma_{\ell'} \le C(\mu)^{-1} \sum_{i=1}^{q} \sigma_i^2 \tag{3.7.31}
\]

(see, also, Appendix F, Corollary 3 and related exercises).

Then the proposition is a consequence of these algebraic properties and
of the observation that all the mentioned matrix elements are in $C^\infty(\Omega)$: in
fact they are obtained by taking products and sums of matrix elements of
$g(\beta)$ and possibly dividing the results by products of determinants of some
principal submatrices of the matrix $g(\beta)$, which are in turn positive by what
has just been mentioned.
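The two-sided bound used in the proof can be made concrete for a 2 x 2 symmetric matrix. The sketch below assumes the bound is written as C |sigma|^2 <= sigma . mu sigma <= |sigma|^2 / C and takes C as the smaller of the lowest eigenvalue and the reciprocal of the highest one; this is one admissible choice, not necessarily the one made in Appendix F. The function names and the sample matrix are illustrative assumptions.

```python
import math

def eigen_range(mu):
    """Smallest and largest eigenvalue of the symmetric 2 x 2 matrix [[a, b], [b, c]]."""
    (a, b), (_, c) = mu
    mean = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    return mean - rad, mean + rad

def bound_constant(mu):
    """A constant C > 0 with C |s|^2 <= s.mu s <= |s|^2 / C, if mu is positive definite."""
    lo, hi = eigen_range(mu)
    if lo <= 0.0:
        raise ValueError("matrix is not positive definite")
    return min(lo, 1.0 / hi)

def quad_form(mu, s):
    """The quadratic form sum over l, l2 of mu[l][l2] s[l] s[l2]."""
    return sum(mu[i][j] * s[i] * s[j] for i in range(2) for j in range(2))

mu = [[2.0, 1.0], [1.0, 3.0]]
C = bound_constant(mu)
```

Since the eigenvalues depend continuously on the matrix elements, so does this C, in agreement with the continuity claimed for the function in the bound.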


The following proposition concerns the existence of a local system of regular
coordinates $(U, \Xi)$ well adapted and orthogonal to a regular surface $\Sigma$
in the vicinity of one of its points $\xi_0$.


12 Proposition. Given a regular surface $\Sigma \subset \mathbb{R}^{Nd}$ with codimension $s$ and
given the scalar product of Eq. (3.7.1) and $\xi_0 \in \Sigma$, it is always possible to
find a neighborhood $U$ of $\xi_0$ on which it is possible to define a local regular
system of coordinates well adapted and orthogonal to $\Sigma$. It is even possible to
construct it so that the kinetic matrix is, in the basis points corresponding to
$\Sigma \cap U$,

\[
g_{\ell\ell'} = \gamma\, \delta_{\ell\ell'}, \qquad \ell, \ell' = 1, \ldots, s, \quad \gamma > 0 \tag{3.7.32}
\]

("Fermi coordinates" on $\Sigma$).


PROOF. Only a sketch will be given, leaving to the reader the task of completing
the proof (see, also, exercises and problems at the end of this section).

Fig.3.1. $\xi$ has coordinates $(\beta^{\parallel}, \beta^{\perp})$ and $\beta^{\parallel}$ = {abscissa of $\xi'$ on $\Sigma$}.

A first reading should be intended just in order to get some ideas about the
proof and the geometrical meaning of Proposition 12: completing the details
should then become easier and transparent.

Let $\xi_0 \in \Sigma$ and let $U$ be a bounded neighborhood on which a local system
of regular coordinates adapted to $\Sigma$, $(U, \Xi)$, is established. Suppose that
$\Xi^{-1}(\xi_0) = 0$, say. An orthogonal and well-adapted system $(\tilde U, \tilde \Xi)$ will be
built by suitably choosing $\tilde U \subset U$. The system is geometrically illustrated in
the case $d = 2$, $s = 1$, in Fig.3.1.

The construction proceeds as follows: at every point $\xi' \in \Sigma \cap U$, consider
a hyperplane in $\mathbb{R}^{Nd}$ orthogonal to $\Sigma$ in $\xi'$ in the sense of the orthogonality
associated with the scalar product of Eq. (3.7.1). Denote this hyperplane by
$\pi(\xi')$ (dotted lines in Fig. 3.1). Fix on $\pi(\xi')$ a system of Cartesian mutually
orthogonal axes with unit vectors $e_1(\xi'), \ldots, e_s(\xi')$, the orthogonality being
in the sense of the scalar product of Eq. (3.7.1) and the length of the axes
being measured in the same sense.

Choose the above unit vectors so that the points $\xi' + e_i(\xi')$, $i = 1, \ldots, s$,
have coordinates which are $C^\infty$ functions of the $Nd - s$ nontrivial coordinates
of $\xi'$ in $(U, \Xi)$; i.e., choose the Cartesian axes "so that they are $C^\infty$ functions
of $\xi' \in \Sigma \cap U$".

There is a neighborhood $U'$ of $\xi_0$, $U' \subset U$, such that every point $\xi \in U'$ is
on a unique plane $\pi(\xi')$ with $\xi'$ suitably chosen on $\Sigma \cap U$ ("consequence of
the finite curvature of $\Sigma$").

To every $\xi \in U'$, we then associate $Nd$ coordinates $\beta$: the first $s$ of them,
denoted $\beta^{\perp} = (\beta_1, \ldots, \beta_s)$, are the coordinates of $\xi$ in the Cartesian frame
chosen on the plane $\pi(\xi')$ containing $\xi$; the remaining $Nd - s$ coordinates
$\beta^{\parallel} = (\beta_{s+1}, \ldots, \beta_{Nd})$ are the coordinates with $\beta_i = \beta'_i$, $i = s+1, \ldots, Nd$, if
$\xi'$ has in $(U, \Xi)$ coordinates $(0, \ldots, 0, \beta'_{s+1}, \ldots, \beta'_{Nd})$.

Setting $\xi = \tilde \Xi(\beta)$, one can check that as $\xi$ varies in $U'$, the point $\beta$ varies
in some open set $\Omega' \subset \mathbb{R}^{Nd}$. Furthermore, the function $\tilde \Xi$ is a $C^\infty$-invertible
map of $\Omega'$ onto $U'$ which is not singular, i.e., its Jacobian matrix [see Eq.
(3.6.10)] has a never-vanishing determinant if $U'$ is small enough. Let $S^{Nd}$ be
a sphere contained in $\Omega'$ centered at $\tilde \Xi^{-1}(\xi_0) = 0$ and set $\tilde U = \tilde \Xi(S^{Nd})$. Then,
essentially by construction, the pair $(\tilde U, \tilde \Xi)$ is a coordinate system which is of
Fermi type on $\Sigma$ with basis $S^{Nd}$ and $\gamma = 1$.

The difficulty, in a rigorous proof, lies in the justification of the actual
possibility of the various "choices" involved in the above descriptive argument,
and in checking the validity of the statements claimed about the uniqueness of
the plane $\pi(\xi')$ through $\xi$ and on the non singularity of the Jacobian matrix

\[
J(\beta) = \frac{\partial \tilde \Xi}{\partial \beta}(\beta), \qquad \beta \in \Omega'. \tag{3.7.33}
\]

The main idea is to use the implicit function theorem to check the above
properties (see the following problems for some more details).
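The construction sketched above can be tried numerically in the case d = 2, s = 1 of Fig. 3.1, with the ordinary scalar product (unit mass). The example below assumes the curve is the parabola y = u^2/2, labels a point by its signed distance sigma along the normal and by the parameter u of its foot point, and approximates the kinetic matrix by central differences. The names `fermi_map` and `kinetic_matrix`, the step `h` and the sample point are assumptions of this sketch; note that u is not the curvilinear abscissa.

```python
import math

def fermi_map(beta):
    """(sigma, u) -> point at signed distance sigma, along the unit normal, from (u, u^2/2)."""
    sigma, u = beta
    norm = math.sqrt(1.0 + u * u)
    nx, ny = -u / norm, 1.0 / norm  # unit normal to the parabola at parameter u
    return [u + sigma * nx, 0.5 * u * u + sigma * ny]

def kinetic_matrix(xi, beta, h=1e-6):
    """Kinetic matrix for unit mass: scalar products of central-difference partial derivatives."""
    cols = []
    for l in range(len(beta)):
        up, dn = list(beta), list(beta)
        up[l] += h
        dn[l] -= h
        cols.append([(a - b) / (2.0 * h) for a, b in zip(xi(up), xi(dn))])
    return [[sum(p * q for p, q in zip(ca, cb)) for cb in cols] for ca in cols]

g_on = kinetic_matrix(fermi_map, [0.0, 0.8])   # a point of the curve (sigma = 0)
g_off = kinetic_matrix(fermi_map, [0.1, 0.8])  # a nearby point off the curve
```

On the curve the first diagonal element should be 1 (well adapted, with gamma = 1), the mixed element should vanish (orthogonality), and the second diagonal element should equal 1 + u^2, the squared speed of the parameterization.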


3.7.1 Exercises and Problems 


1. Establish an orthogonal regular system of coordinates well adapted to the circle $\Gamma \subset \mathbb{R}^2$,
with radius 1 and center at the origin, with respect to the scalar product
$\eta \cdot \chi = \eta_1 \chi_1 + \eta_2 \chi_2$, in the neighborhood of a generic point $\xi_0 \in \Gamma$.


2. Same as Problem 1, replacing $\Gamma$ with the parabola $y = x^2$, the hyperbola $xy = 1$, or the
ellipse $x^2/a^2 + y^2/b^2 = 1$, $a, b > 0$.


3.* Let $\Gamma \subset \mathbb{R}^2$ be a simple $C^\infty$ curve in $\mathbb{R}^2$ parameterized in terms of its curvilinear
abscissa $s \in \mathbb{R}$ as:

\[
\xi_1 = X_1(s), \quad \xi_2 = X_2(s), \qquad (X_1'(s))^2 + (X_2'(s))^2 = 1, \qquad
\lim_{s \to \pm\infty} (X_1(s))^2 + (X_2(s))^2 = +\infty,
\]

where $X_1', X_2'$ are the derivatives of $X_1, X_2$. For every point on $\Gamma$ with abscissa $s \in \mathbb{R}$,
consider the normal line $n(s)$ with equation $(\xi_1 - X_1(s)) X_1'(s) + (\xi_2 - X_2(s)) X_2'(s) = 0$. Show that given
$R > 0$, there is $\delta > 0$ such that the segments of length $2\delta$ cut around $(X_1(s), X_2(s))$ on the
line $n(s)$ are pairwise disjoint, $\forall\, 0 \le |s| < R$. (Hint: Define for $|s| < R$, $|\sigma| < \delta$:

\[
F_1(s, \sigma) = X_1(s) + \sigma X_2'(s), \qquad F_2(s, \sigma) = X_2(s) - \sigma X_1'(s)
\]

and note that the equations $F_1(s, \sigma) = F_1(\bar s, \bar \sigma)$, $F_2(s, \sigma) = F_2(\bar s, \bar \sigma)$, thought of as equations
for $(s, \sigma)$ parameterized by $\bar s, \bar \sigma$, have $s = \bar s$, $\sigma = \bar \sigma$ as a unique solution near $\bar s, \bar \sigma$, if $\bar \sigma$ is small
(using the implicit function theorem). Then, by using the possibility of choosing $\delta$ small,
show that they have a unique solution in the entire region $|s| < R$, $|\sigma| < \delta$, etc.)


4.* In the context of Problem 3, show that there is $\delta' < \delta$ such that the image via $(F_1, F_2)$
of $(-R, R) \times (-\delta', \delta')$ is a neighborhood $U$ of $(F_1(0,0), F_2(0,0)) \in \Gamma$ where the map
$(s, \sigma) \to (F_1(s, \sigma), F_2(s, \sigma))$ is invertible, $C^\infty$ and with nonvanishing Jacobian, i.e., the map
$(U, F)$ is an adapted system of local regular coordinates.


5.* In the context of Problem 4, compute the kinetic matrix $g(s, \sigma)$ and show that $(U, F)$
is a well-adapted orthogonal coordinate system for $\Gamma$, with respect to the scalar product in
Problem 1, and $g$ is on $\Gamma$ a $2 \times 2$ diagonal matrix.


6.* Let $z = s + i\sigma$, $(s, \sigma) \in \mathbb{R}^2$, and let $f$ be a function on $\mathbb{C}$ admitting a representation
$f(z) = \sum_{n=0}^{\infty} c_n z^n$ with $c_n \to 0$ as $n \to \infty$ so fast that the series has an infinite radius of
convergence. Also suppose that $f'(z_0) = \sum_{n=0}^{\infty} n c_n z_0^{n-1} \ne 0$ for some $z_0 \in \mathbb{C}$.

Let $\gamma_1, \gamma_2$ be two segments of regular curves crossing at $z_0 \in \mathbb{C}$ and there forming an
angle $\varphi_0$. Show that the $f$ images of $\gamma_1$ and $\gamma_2$, $f(\gamma_1)$ and $f(\gamma_2)$, cross at $f(z_0)$ forming
the same angle $\varphi_0$. (Hint: Let $\{dz_1\}$ and $\{dz_2\}$ be two infinitesimal segments in $z_0$
directed along $\gamma_1$ and $\gamma_2$. Show that $f(\{dz_1\}) = f'(z_0)\{dz_1\}$, $f(\{dz_2\}) = f'(z_0)\{dz_2\}$, where
$f'(z_0) = \sum_{n=0}^{\infty} n c_n z_0^{n-1} \ne 0$, and interpret this as saying that $dz_1$ and $dz_2$ are transformed
into two infinitesimal segments emerging from $f(z_0)$, elongated by a factor $\rho_0 = |f'(z_0)|$,
and rotated by an angle $\theta_0 = \arg f'(z_0)$ ("conformal mapping property").)


7.* In the context of Problem 6, suppose that $z \to f(z)$ is one to one near $z_0$ and that
$f'(z_0) \ne 0$. Call $U$ a neighborhood of $z_0$ where this happens. Show that the map
$(s, \sigma) \in U \to (s', \sigma') = (\mathrm{Re}\, f(z), \mathrm{Im}\, f(z))$ (i.e., $z \to f(z)$) maps $U$ onto a neighborhood $V$ of
$(\mathrm{Re}\, f(z_0), \mathrm{Im}\, f(z_0))$ and establishes a local system of regular coordinates on $V$.

Let $U$ be a disk around $z_0$ and $z_0 = 0$. Consider the curve in $V$ whose equations
are $s' = \mathrm{Re}\, f(s)$, $\sigma' = \mathrm{Im}\, f(s)$ for $(s, 0) \in U$. Show that the above coordinate system is
orthogonal on the curve $\Gamma$ image of the points in $U$ of the form $(s, 0)$.


8.* Without use of complex functions, extend the argument of Problem 3 to a regular
surface $\Sigma$ in $\mathbb{R}^d$ with $\Sigma$ and $d$ arbitrary, using the ordinary scalar product in $\mathbb{R}^d$. (Hint:
Follow the pattern of the sketch of the proof of Proposition 12 and of Problem 3, using
the implicit function theorem.)


3.8 A Perfection Criterion for Approximate Constraints 


This section is devoted to the analysis of the following interesting proposition, 
“Arnold’s theorem”, see historical note on p.211. 


13 Proposition. Consider $N$ points, with masses $m_1, \ldots, m_N > 0$, and a
model of bilateral conservative approximate constraint $(\Sigma, W, \lambda)$ with codimension
$s$. Suppose that $\forall \xi_0 \in \Sigma$, there is a neighborhood $U$ admitting a system
of local regular coordinates $(U, \Xi)$ with basis $\Omega$, well adapted and orthogonal
on $\Sigma$ with respect to the scalar product of Eq. (3.7.1), and such that

\[
W(\Xi(\beta)) = \overline{W}(\beta_1, \ldots, \beta_s), \tag{3.8.1}
\]

where $\beta = (\beta_1, \ldots, \beta_s, \beta_{s+1}, \ldots, \beta_{Nd})$ and $\overline{W}$ is a real $C^\infty(\mathbb{R}^s)$ function,
vanishing at the origin and having a strict minimum there.
Then the constraint model $(\Sigma, W, \lambda)$ is an ideal approximate constraint, in the
sense of Definition 13, p.180.


Observations.

(1) We already noted that it is always possible to find a neighborhood $U$ of $\xi_0$
on which a local system of regular coordinates, well adapted and orthogonal
on $\Sigma$, can be established. In general, however, the functions $\beta \to W(\Xi(\beta))$
will depend on all the $Nd$ coordinates of $\beta \in \Omega$ and not just on the first $s$ (one
can say that $W$ will not, in general, be "purely orthogonal" to the constraint).
(2) Before proceeding to the proof, let us discuss the following example.


Example. Consider a two points system, with masses $m_1, m_2 > 0$, in $\mathbb{R}^3$
and the constraint model ("rigid link at distance $\ell$") defined by

\[
\Sigma = \{ \xi \mid \xi = (\xi^{(1)}, \xi^{(2)}), \ |\xi^{(1)} - \xi^{(2)}| = \ell \}, \tag{3.8.2}
\]

\[
W(\xi^{(1)}, \xi^{(2)}) = \big( (\xi^{(1)} - \xi^{(2)})^2 - \ell^2 \big)^2, \tag{3.8.3}
\]

where $\ell > 0$ is given. Let us check that the approximate conservative constraint
model $(\Sigma, W, \lambda)$ verifies the assumptions of Proposition 13. Define the
following Baricentric-Polar coordinates:

\[
\begin{aligned}
\beta_1 &= |\xi^{(1)} - \xi^{(2)}| - \ell = \rho - \ell, & \beta_2 &= \theta, & \beta_3 &= \varphi, \\
\beta_4 &= (\xi_G)_1, & \beta_5 &= (\xi_G)_2, & \beta_6 &= (\xi_G)_3,
\end{aligned} \tag{3.8.4}
\]

where $(\rho, \theta, \varphi)$ are the polar coordinates of the vector $\xi^{(1)} - \xi^{(2)}$ and $\xi_G$ is the
vector determining the baricenter position:

\[
\xi_G = \frac{m_1 \xi^{(1)} + m_2 \xi^{(2)}}{m_1 + m_2}. \tag{3.8.5}
\]

Through Eq. (3.8.4), one can easily establish a regular local system of coordinates
$(U, \Xi)$ in the vicinity of any point $\xi_0 \in \Sigma$ such that $\theta \in (0, \pi)$, $\varphi \in (0, 2\pi)$
(which, by the arbitrariness of the choice of Cartesian axes, is not a real
restriction). This reference system is adapted to $\Sigma$, and $\Sigma$ is given by $\beta_1 = 0$.
To check that it is well adapted and orthogonal, for the scalar product of Eq.
(3.7.1), compute the kinetic matrix remarking that

\[
m_1 (\dot \xi^{(1)})^2 + m_2 (\dot \xi^{(2)})^2 = (m_1 + m_2)\, \dot \xi_G^2 + \frac{m_1 m_2}{m_1 + m_2}\, (\dot \xi^{(1)} - \dot \xi^{(2)})^2, \tag{3.8.6}
\]

which follows immediately from the relations

\[
\xi^{(1)} = \xi_G + \frac{m_2}{m_1 + m_2} (\xi^{(1)} - \xi^{(2)}), \qquad
\xi^{(2)} = \xi_G - \frac{m_1}{m_1 + m_2} (\xi^{(1)} - \xi^{(2)}), \tag{3.8.7}
\]

by differentiation and some algebra. Since

\[
(\dot \xi^{(1)} - \dot \xi^{(2)})^2 = \dot \rho^2 + \rho^2 \dot \theta^2 + \rho^2 (\sin \theta)^2 \dot \varphi^2, \tag{3.8.8}
\]

which can be seen by recalling that the line element in polar coordinates is
$d\rho^2 + \rho^2 d\theta^2 + \rho^2 (\sin \theta)^2 d\varphi^2$ and that $(\rho, \theta, \varphi)$ are just the polar coordinates
of $\xi^{(1)} - \xi^{(2)}$, it follows that Eqs. (3.8.6) and (3.8.8) give

\[
m_1 (\dot \xi^{(1)})^2 + m_2 (\dot \xi^{(2)})^2 = (m_1 + m_2)(\dot \beta_4^2 + \dot \beta_5^2 + \dot \beta_6^2)
+ \frac{m_1 m_2}{m_1 + m_2} \big( \dot \beta_1^2 + (\ell + \beta_1)^2 \dot \beta_2^2 + (\ell + \beta_1)^2 (\sin \beta_2)^2 \dot \beta_3^2 \big), \tag{3.8.9}
\]

showing that the coordinates of Eq. (3.8.4) are well adapted and orthogonal
on $\Sigma$ (in fact, $g_{11}(\beta) = \frac{m_1 m_2}{m_1 + m_2}$ and the quadratic form of Eq. (3.8.9) does
not contain the mixed terms $\dot \beta_1 \dot \beta_\ell$, $\ell > 1$).
In this coordinate system, the constraint structure function $W$ of Eq.
(3.8.3) is simply $(\beta_1^2 + 2 \ell \beta_1)^2$, i.e., it depends only on $\beta_1$.
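The claims made in this example about the kinetic matrix can be checked numerically. The sketch below assumes sample values for the masses and for the link length, builds the map from the Baricentric-Polar coordinates to the six Cartesian components, and approximates the kinetic matrix by central differences; `link_map`, `kinetic_matrix` and all numerical values are illustrative assumptions.

```python
import math

M1, M2, ELL = 2.0, 3.0, 1.5  # sample masses and link length (assumed values)

def link_map(beta):
    """(beta_1, theta, phi, baricenter) -> Cartesian components of the two points."""
    b1, th, ph, gx, gy, gz = beta
    rho = ELL + b1
    rel = [rho * math.sin(th) * math.cos(ph),
           rho * math.sin(th) * math.sin(ph),
           rho * math.cos(th)]  # the vector xi1 - xi2 in polar form
    tot = M1 + M2
    p1 = [c + M2 / tot * r for c, r in zip((gx, gy, gz), rel)]
    p2 = [c - M1 / tot * r for c, r in zip((gx, gy, gz), rel)]
    return p1 + p2

def kinetic_matrix(xi, beta, weights, h=1e-6):
    """Mass-weighted scalar products of central-difference partial derivatives."""
    cols = []
    for l in range(len(beta)):
        up, dn = list(beta), list(beta)
        up[l] += h
        dn[l] -= h
        cols.append([(a - b) / (2.0 * h) for a, b in zip(xi(up), xi(dn))])
    return [[sum(w * p * q for w, p, q in zip(weights, ca, cb)) for cb in cols]
            for ca in cols]

weights = [M1] * 3 + [M2] * 3
g = kinetic_matrix(link_map, [0.0, 1.1, 0.4, 0.2, -0.3, 0.5], weights)
reduced_mass = M1 * M2 / (M1 + M2)
```

The element g[0][0] should equal the reduced mass m1 m2 / (m1 + m2) whatever the position on the surface, and the first row should have no other nonvanishing entries.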
A further example is the model $(\Sigma, W, \lambda)$ for a single point in $\mathbb{R}^3$ bound
to a regular surface $\sigma$: $\Sigma = \{ \xi \mid \xi \in \sigma \}$, and $W$ is a $C^\infty$ function of $\xi$, positive
outside $\Sigma$ and having, for $\xi$ close enough to $\Sigma$, the form

\[
W(\xi) = (n \cdot (\xi - \hat \xi))^2, \tag{3.8.10}
\]

where $\hat \xi$ is the point on $\sigma$ closest to $\xi$ and $n$ is a unit vector normal to $\sigma$ in $\hat \xi$.
(The proof is left as a problem.)
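What Proposition 13 asserts can be observed in a small numerical experiment. The sketch below assumes a point of unit mass in a vertical plane, the unit circle as constraint surface, gravity as active force, and W = (rho - 1)^2 / 2, which depends only on the transversal coordinate rho - 1 of the polar system. It integrates the motion with potential energy V + lambda W and the ideal pendulum side by side with a Runge-Kutta scheme, and records the largest distance between them on [0, t_end]. All names, the integrator and the numerical values are assumptions of this illustration.

```python
import math

G = 1.0  # gravity acceleration; unit mass and unit pendulum length are assumed

def rk4_step(f, y, dt):
    """One classical Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f([a + 0.5 * dt * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * dt * b for a, b in zip(y, k2)])
    k4 = f([a + dt * b for a, b in zip(y, k3)])
    return [a + dt * (p + 2 * q + 2 * r + s) / 6.0
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def stiff_field(lam):
    """Newton's equations for the potential G z + lam (rho - 1)^2 / 2."""
    def f(y):
        x, z, vx, vz = y
        rho = math.hypot(x, z)
        k = lam * (rho - 1.0) / rho  # radial restoring force divided by rho
        return [vx, vz, -k * x, -G - k * z]
    return f

def ideal_field(y):
    """Ideal pendulum; theta is measured from the downward vertical."""
    theta, omega = y
    return [omega, -G * math.sin(theta)]

def max_gap(lam, theta0=1.0, t_end=2.0, dt=1e-3):
    """Largest distance on [0, t_end] between the stiff motion and the constrained one."""
    f = stiff_field(lam)
    ys = [math.sin(theta0), -math.cos(theta0), 0.0, 0.0]
    yi = [theta0, 0.0]
    worst = 0.0
    for _ in range(int(round(t_end / dt))):
        ys = rk4_step(f, ys, dt)
        yi = rk4_step(ideal_field, yi, dt)
        worst = max(worst, math.hypot(ys[0] - math.sin(yi[0]), ys[1] + math.cos(yi[0])))
    return worst
```

Calling `max_gap(1.0e2)` and `max_gap(1.0e4)` should show the gap shrinking as lambda grows, as expected from the proposition; the time step must stay small compared with the period of the transversal oscillations, which is of order lambda^(-1/2).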


PROOF (OF PROPOSITION 13). Let $(\eta_0, \xi_0)$ be an initial datum for the given
system of point masses, with $\xi_0 \in \Sigma$.

Fix $\lambda > 1$ and a function $V^{(a)} \in C^\infty(\mathbb{R}^{Nd})$ bounded from below. Let $U$ be
a neighborhood of the point $\xi_0$ where it is possible to define a local system
of regular coordinates $(U, \Xi)$ with basis $\Omega$, well adapted and orthogonal on
$\Sigma$ and such that Eq. (3.8.1) holds in this system. Suppose $\Xi^{-1}(\xi_0) = 0$. We
also suppose, for the sake of simplicity, that $\overline{W}$ has a rather special form:

\[
\overline{W}(\beta_1, \ldots, \beta_s) = \frac{1}{2} \sum_{j=1}^{s} \beta_j^2. \tag{3.8.11}
\]

In spite of the particularity of Eq. (3.8.11), this is an assumption that can
be eliminated through some formal complications which would only make the
true difficulties of the problem and the solution method more obscure (see
problems at the end of this section).

Denote $t \to x_\lambda(t)$, $t \in \mathbb{R}_+$, the motion that the $N$ points perform under
the influence of the force with potential energy $V^{(a)} + \lambda W$ starting from the
initial datum $(\eta_0, \xi_0)$. By energy conservation it follows that

\[
\frac{1}{2} \sum_{j=1}^{N} m_j (\dot x^{(j)}_\lambda(t))^2 + V^{(a)}(x_\lambda(t)) + \lambda W(x_\lambda(t)) = E \tag{3.8.12}
\]

is a constant in $t$ and

\[
E = \frac{1}{2} \sum_{j=1}^{N} m_j (\eta_0^{(j)})^2 + V^{(a)}(\xi_0) \tag{3.8.13}
\]

is $\lambda$ independent because $\xi_0 \in \Sigma$ and $W$ vanishes on $\Sigma$.
Then Eq. (3.8.12) and the assumed boundedness of $V^{(a)}$ imply that

\[
\frac{1}{2} \sum_{j=1}^{N} m_j (\dot x^{(j)}_\lambda(t))^2 \le E + \sup_{\xi} (-V^{(a)}(\xi)) \equiv C < +\infty. \tag{3.8.14}
\]

If $S_\rho$ denotes a closed ball with radius $\rho$ and center $\xi_0$, contained in $U$, Eq.
(3.8.14) will imply that the motion $t \to x_\lambda(t)$ will develop remaining inside
$S_\rho$ for $t \in [0, T]$, i.e., $x_\lambda(t) \in S_\rho$, $\forall t \in [0, T]$, if $T$ is chosen so that

\[
T \sqrt{\frac{2C}{\min_j m_j}} < \rho. \tag{3.8.15}
\]

Fix $T$ verifying Eq. (3.8.15) and consider the motion $t \to x_\lambda(t)$, $t \in [0, T]$.

Existence of the limit $\lim_{\lambda \to +\infty} x_\lambda(t) = x(t)$ and validity of Eqs. (3.7.18)
and (3.7.20) will be shown only for $t \in [0, T]$. The treatment of the general
case ($t$ arbitrarily large) contains some additional difficulties and will not be
discussed in detail. Such difficulties have a geometrical character and depend
on the fact that $(U, \Xi)$ is generally a local coordinate system and not a global
one for all of $\Sigma$, see Problem 5 at the end of this section.

First it will be shown that the motion $t \to x_\lambda(t)$, $t \in [0, T]$, tends to evolve
on $\Sigma$ as $\lambda \to +\infty$.

This is a simple consequence of energy conservation and of the positivity
(only) of $W$: it does not depend on the special hypothesis on the constraint
nature [Eq. (3.8.11)], but it would be valid even for general approximate
conservative constraints.

Let $t \to b_\lambda(t)$, $t \in [0, T]$, be the image of the motion $x_\lambda$, observed for
$t \in [0, T]$, in the basis $\Omega$ of the coordinate system: $b_\lambda(t) = \Xi^{-1}(x_\lambda(t))$, $t \in [0, T]$.
Rewrite the energy conservation equation in the local coordinates $(U, \Xi)$ by
using the kinetic matrix $g_{\ell\ell'}$ in this reference system:

\[
\frac{1}{2} \sum_{\ell,\ell'=1}^{Nd} g_{\ell\ell'}(b_\lambda(t))\, \dot b_{\lambda,\ell}(t)\, \dot b_{\lambda,\ell'}(t)
+ V^{(a)}(\Xi(b_\lambda(t))) + \frac{\lambda}{2} \sum_{j=1}^{s} b_{\lambda,j}(t)^2 = E, \tag{3.8.16}
\]

having used Eq. (3.8.11) to express $W$.
The first of the above three addends is non-negative (being the kinetic
energy; see, also, Proposition 11, §3.7, p.182). Hence, Eq. (3.8.16) implies

\[
|b_{\lambda,j}(t)| \le \sqrt{\frac{2C}{\lambda}}, \qquad j = 1, \ldots, s, \tag{3.8.17}
\]

if $C$ is defined by Eq. (3.8.14).
From the examples discussed in §3.6, it is expected that the motion
$t \to b_\lambda(t)$, although squeezed on $\Sigma$, will very quickly oscillate transversally to $\Sigma$.
It will therefore be useful to estimate the velocities $\dot b_{\lambda,i}(t)$, $i = 1, \ldots, s$, of the
"vanishing coordinates". Equation (3.8.16) also provides such estimates: by
Proposition 11, p.182, we can say that there is a constant $g^{-1}$ = {minimum
of $C(\beta)$ in Eq. (3.7.29) for $\Xi(\beta) \in S_\rho$} for which

\[
\frac{1}{2g} \sum_{\ell=1}^{Nd} (\dot b_{\lambda,\ell}(t))^2 \le \frac{1}{2} \sum_{\ell,\ell'=1}^{Nd} g_{\ell\ell'}(b_\lambda(t))\, \dot b_{\lambda,\ell}(t)\, \dot b_{\lambda,\ell'}(t), \tag{3.8.18}
\]

$\forall t \in [0, T]$, because for such $t$'s and by the choice of $T$, $x_\lambda(t) = \Xi(b_\lambda(t)) \in S_\rho$.
Then Eqs. (3.8.18) and (3.8.16) imply, $\forall \lambda > 0$,

\[
|\dot b_{\lambda,\ell}(t)| \le \sqrt{2Cg}, \qquad \ell = 1, \ldots, Nd, \quad t \in [0, T]. \tag{3.8.19}
\]

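By the orthogonality and adaptation properties of the coordinate system
$(U, \Xi)$, setting $\beta \stackrel{\rm def}{=} (\beta_v, \beta_n)$ with $\beta_v = (\beta_1, \ldots, \beta_s) \in \mathbb{R}^s$,
$\beta_n = (\beta_{s+1}, \ldots, \beta_{Nd}) \in \mathbb{R}^{Nd-s}$, it is

\[
g_{\ell\ell'}(0, \beta_n) = 0, \qquad \ell = 1, \ldots, s; \quad \ell' = s+1, \ldots, Nd \tag{3.8.20}
\]

(orthogonality) and

\[
g_{\ell\ell'}(0, \beta_n) = \gamma_{\ell\ell'}, \qquad \ell, \ell' = 1, \ldots, s \tag{3.8.21}
\]

(good adaptation), where $\gamma$ is a constant $s \times s$ matrix. Since the functions
$g_{\ell\ell'}(\beta)$, $\beta \in \Omega$, are $C^\infty$ functions, the Taylor-Lagrange theorem (see Appendix B),
$\forall (\beta_v, \beta_n) \in \Omega$, $\forall \ell = 1, \ldots, s$, $\ell' = s+1, \ldots, Nd$, yields

\[
g_{\ell\ell'}(\beta_v, \beta_n) = \sum_{j=1}^{s} g_{\ell\ell' j}(\beta_v, \beta_n)\, \beta_j \tag{3.8.22}
\]

and, $\forall \ell, \ell' = 1, \ldots, s$,

\[
g_{\ell\ell'}(\beta_v, \beta_n) = \gamma_{\ell\ell'} + \sum_{j=1}^{s} g_{\ell\ell' j}(\beta_v, \beta_n)\, \beta_j, \tag{3.8.23}
\]

where $g_{\ell\ell' j}(\beta)$, $\beta \in \Omega$, are suitable $C^\infty(\Omega)$ functions.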

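Equations (3.8.22) and (3.8.23) can be used to write "more explicitly" the
equations of motion (3.7.15) for the "non constrained coordinates", i.e., for the
$\beta_j$'s with $j = s+1, \ldots, Nd$. Using Eq. (3.8.1), one finds, for $\ell = s+1, \ldots, Nd$:

\[
\begin{aligned}
\frac{d}{dt} \Big( \Big[ \sum_{\ell'=1}^{s} \sum_{j=1}^{s} g_{\ell\ell' j}(b_\lambda(t))\, b_{\lambda,j}(t)\, \dot b_{\lambda,\ell'}(t) \Big]
&+ \sum_{\ell'=s+1}^{Nd} g_{\ell\ell'}(b_\lambda(t))\, \dot b_{\lambda,\ell'}(t) \Big) \\
= - \frac{\partial V^{(a)}(\Xi(\beta))}{\partial \beta_\ell} \Big|_{\beta = b_\lambda(t)}
&+ \frac{1}{2} \sum_{\ell',\ell''=s+1}^{Nd} \frac{\partial g_{\ell'\ell''}}{\partial \beta_\ell}(b_\lambda(t))\, \dot b_{\lambda,\ell'}(t)\, \dot b_{\lambda,\ell''}(t) \\
&+ \Big[ \sum_{\ell'=1}^{s} \sum_{\ell''=s+1}^{Nd} \sum_{j=1}^{s} \frac{\partial g_{\ell'\ell'' j}}{\partial \beta_\ell}(b_\lambda(t))\, b_{\lambda,j}(t)\, \dot b_{\lambda,\ell'}(t)\, \dot b_{\lambda,\ell''}(t) \\
&\qquad + \frac{1}{2} \sum_{\ell',\ell''=1}^{s} \sum_{j=1}^{s} \frac{\partial g_{\ell'\ell'' j}}{\partial \beta_\ell}(b_\lambda(t))\, b_{\lambda,j}(t)\, \dot b_{\lambda,\ell'}(t)\, \dot b_{\lambda,\ell''}(t) \Big],
\end{aligned} \tag{3.8.24}
\]

where the square brackets isolate the terms which should vanish, as $\lambda \to +\infty$,
in order that Eq. (3.8.24) could reduce at least formally to Eq. (3.7.20)
as wished on the basis of Definition 13, §3.7, p.180 and Observation (4) to
Definition 13, p.182.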

Note that in Eq. (3.8.24) every term in square brackets contains factors
proportional to one of the first $s$ coordinates which, by Eq. (3.8.17), vanish
as $\lambda \to +\infty$ uniformly in $t \in [0, T]$. The coefficients in Eq. (3.8.24)
of such coordinates are uniformly bounded in $\lambda$, as the motion takes place
in $S_\rho$, for $t \in [0, T]$, and there the functions $g_{\ell\ell'}$ and $g_{\ell\ell' j}$ are $C^\infty$ and, therefore,
bounded together with their derivatives; furthermore, Eq. (3.8.19) provides
$\lambda$-independent bounds for $\dot b_{\lambda,\ell}(t)$.

To understand in a rigorous way that the above formal convergence of Eqs.
(3.8.17) and (3.8.24) to Eqs. (3.7.18) and (3.7.20) implies that, uniformly in
$t \in [0, T]$, the functions $t \to b_{\lambda,\ell}(t)$, $\ell = s+1, \ldots, Nd$, converge to limits $b_\ell(t)$
verifying Eq. (3.7.20) with the desired initial conditions (3.7.24), some more
work is still necessary.

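Integrate both sides of Eq. (3.8.24) with respect to $t$, $\forall \ell = s+1,\ldots,Nd$:
$$\begin{aligned}
&\sum_{\ell'=s+1}^{Nd} g_{\ell\ell'}(b_\lambda(t))\,\dot b_{\lambda\ell'}(t) + \Big[\sum_{\ell'=1}^{s}\sum_{j=1}^{s} g_{\ell\ell',j}(b_\lambda(t))\,b_{\lambda j}(t)\,\dot b_{\lambda\ell'}(t)\Big] \\
&\qquad - \sum_{\ell'=s+1}^{Nd} g_{\ell\ell'}(b(0))\,\dot b_{\ell'}(0) \\
&\quad = \int_0^t\Big\{-\sum_{i=1}^{N}\frac{\partial V^{(a)}}{\partial\xi^{(i)}}(\Xi(b_\lambda(t')))\cdot\frac{\partial\Xi^{(i)}}{\partial\beta_\ell}(b_\lambda(t')) + \frac12\sum_{\ell',\ell''=s+1}^{Nd}\frac{\partial g_{\ell'\ell''}}{\partial\beta_\ell}(b_\lambda(t'))\,\dot b_{\lambda\ell'}(t')\,\dot b_{\lambda\ell''}(t')\Big\}\,dt' \\
&\qquad + \int_0^t\Big[\sum_{\ell'=1}^{s}\sum_{\ell''=s+1}^{Nd}\sum_{j=1}^{s}\frac{\partial g_{\ell'\ell'',j}}{\partial\beta_\ell}(b_\lambda(t'))\,b_{\lambda j}(t')\,\dot b_{\lambda\ell'}(t')\,\dot b_{\lambda\ell''}(t') \\
&\qquad\qquad + \frac12\sum_{\ell',\ell''=1}^{s}\sum_{j=1}^{s}\frac{\partial g_{\ell'\ell'',j}}{\partial\beta_\ell}(b_\lambda(t'))\,b_{\lambda j}(t')\,\dot b_{\lambda\ell'}(t')\,\dot b_{\lambda\ell''}(t')\Big]\,dt'
\end{aligned}\tag{3.8.25}$$
where in the second line the hypothesis that $b_{\lambda\ell}(0) = 0$, if $\ell = 1,\ldots,s$, is used together with the $\lambda$-independence of the initial data $b_{\lambda\ell}(0)$, $\dot b_{\lambda\ell}(0)$, $\ell = 1,\ldots,Nd$.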

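Now bring the second and third addends to the right-hand side and consider the resulting equations as $(Nd - s)$ linear equations in the $(Nd - s)$ unknowns $\dot b_{\lambda\ell}(t)$, $\ell = s+1,\ldots,Nd$, pretending that the right-hand side is known. The matrix of the coefficients is the last $(Nd - s) \times (Nd - s)$ principal submatrix $g_s$ of the kinetic matrix $g$: $(g_s)_{ij} = g_{ij}(b_\lambda(t))$, $i,j = s+1,\ldots,Nd$. By Proposition 11, §3.7, p.182, $g_s$ admits an inverse matrix $g_s(b_\lambda(t))^{-1}$ (making explicit its $b_\lambda(t)$ dependence). Therefore, $\dot b_{\lambda\ell}(t)$, $\ell = s+1,\ldots,Nd$, can be expressed in terms of the right-hand side. Thus:
$$\begin{aligned}
\dot b_{\lambda\ell}(t) ={}& \Big[-\sum_{\ell'=s+1}^{Nd}\big(g_s(b_\lambda(t))^{-1}\big)_{\ell\ell'}\sum_{\ell''=1}^{s}\sum_{j=1}^{s} g_{\ell'\ell'',j}(b_\lambda(t))\,b_{\lambda j}(t)\,\dot b_{\lambda\ell''}(t)\Big] \\
&+ \sum_{\ell'=s+1}^{Nd}\big(g_s(b_\lambda(t))^{-1}\big)_{\ell\ell'}\sum_{\ell''=s+1}^{Nd} g_{\ell'\ell''}(b(0))\,\dot b_{\ell''}(0) \\
&+ \sum_{\ell'=s+1}^{Nd}\big(g_s(b_\lambda(t))^{-1}\big)_{\ell\ell'}\int_0^t\Big\{-\sum_{i=1}^{N}\frac{\partial V^{(a)}}{\partial\xi^{(i)}}(\Xi(b_\lambda(t')))\cdot\frac{\partial\Xi^{(i)}}{\partial\beta_{\ell'}}(b_\lambda(t')) \\
&\qquad\qquad + \frac12\sum_{\ell'',\ell'''=s+1}^{Nd}\frac{\partial g_{\ell''\ell'''}}{\partial\beta_{\ell'}}(b_\lambda(t'))\,\dot b_{\lambda\ell''}(t')\,\dot b_{\lambda\ell'''}(t')\Big\}\,dt' \\
&+ \Big[\sum_{\ell'=s+1}^{Nd}\big(g_s(b_\lambda(t))^{-1}\big)_{\ell\ell'}\int_0^t\Big\{\sum_{\ell''=1}^{s}\sum_{\ell'''=s+1}^{Nd}\sum_{j=1}^{s}\frac{\partial g_{\ell''\ell''',j}}{\partial\beta_{\ell'}}(b_\lambda(t'))\,b_{\lambda j}(t')\,\dot b_{\lambda\ell''}(t')\,\dot b_{\lambda\ell'''}(t') \\
&\qquad\qquad + \frac12\sum_{\ell'',\ell'''=1}^{s}\sum_{j=1}^{s}\frac{\partial g_{\ell''\ell''',j}}{\partial\beta_{\ell'}}(b_\lambda(t'))\,b_{\lambda j}(t')\,\dot b_{\lambda\ell''}(t')\,\dot b_{\lambda\ell'''}(t')\Big\}\,dt'\Big]
\end{aligned}\tag{3.8.26}$$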

It has, now, to be remarked that the terms in square brackets vanish uniformly in $t \in [0,T]$ as $\lambda \to +\infty$ because of Eqs. (3.8.17) and (3.8.19) and because of the uniform boundedness in $\Xi^{-1}(S_\rho)$ of the $g$ functions and of their derivatives.

Furthermore, convergence to a limit, as $\lambda \to +\infty$, of the terms which are not in square brackets in Eq. (3.8.26) follows: call $\delta_{\lambda,\ell}(t)$, $t \in [0,T]$, their sum and show, first, that a subsequence $\lambda_n \to +\infty$, extracted from an arbitrary diverging sequence, exists such that $\delta_{\lambda_n,\ell}(t)$ converges to a limit $\delta_\ell(t)$ uniformly in $t \in [0,T]$, $\forall \ell = s+1,\ldots,Nd$.

This will be shown by proving that the family of functions on $[0,T]$ parameterized by $\lambda$ and $\ell$: $(\delta_{\lambda,\ell})_{\lambda \ge 1,\ \ell = s+1,\ldots,Nd}$ is an equicontinuous and equibounded family of functions on $[0,T]$, and then applying the Ascoli-Arzelà theorem (see Appendix H).

Finally, the actual existence of the limit as $\lambda \to +\infty$ of $\delta_{\lambda,\ell}(t)$ will be obtained by showing that every limit of the converging subsequences verifies a certain differential equation with given initial conditions, whatever the subsequence is, and applying the uniqueness theorem for differential equations: the equation will essentially turn out to coincide with Eq. (3.7.20) and the proof will then be complete.

Equiboundedness (see Appendix H) of the functions is clear from Eqs. (3.8.17) and (3.8.19). Equicontinuity of the contribution to $\delta_{\lambda,\ell}$ coming from the integral of Eq. (3.8.26) and that coming from the part outside the integral can be separately shown. They follow from the remarks:

(i) Consider a family $(\mu_\alpha)_{\alpha \in A}$ of functions on $[0,T]$ given by
$$\mu_\alpha(t) = \int_0^t \nu_\alpha(\tau)\,d\tau \tag{3.8.27}$$
where $(\nu_\alpha)_{\alpha \in A}$ is a family of equibounded continuous functions on $[0,T]$, i.e., a family of functions bounded as $|\nu_\alpha(t)| \le B$, $\forall t \in [0,T]$, with a suitable $B$, $\forall \alpha \in A$. Then the family $(\mu_\alpha)_{\alpha \in A}$ is equicontinuous; in fact,
$$|\mu_\alpha(t) - \mu_\alpha(t')| = \Big|\int_{t'}^{t}\nu_\alpha(\tau)\,d\tau\Big| \le B\,|t - t'|. \tag{3.8.28}$$

(ii) Families of functions obtained by composing a given $C^\infty(R^h)$ function and a family of equicontinuous equibounded $R^h$-valued functions on $[0,T]$ form equicontinuous equibounded families of functions, $\forall h > 0$ (exercise).
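
Remark (i) can be checked numerically on a concrete family. The sketch below assumes, purely for illustration, the family $\nu_\alpha(\tau) = \cos(\alpha\tau)$ (so that $B = 1$ and $\mu_\alpha(t) = \sin(\alpha t)/\alpha$); the family, the grid and the function names are hypothetical choices, not taken from the text. It verifies that the difference quotients of all the $\mu_\alpha$ stay below the same constant $B$, which is the content of Eq. (3.8.28).

```python
import math

B = 1.0  # uniform bound on |nu_alpha|, since |cos| <= 1 (assumed example family)

def mu(alpha, t):
    """mu_alpha(t) = integral over [0, t] of cos(alpha * tau), in closed form."""
    return math.sin(alpha * t) / alpha

def worst_lipschitz_ratio(alphas, times):
    """Largest |mu_alpha(t) - mu_alpha(t')| / |t - t'| over the sampled family."""
    worst = 0.0
    for alpha in alphas:
        values = [mu(alpha, t) for t in times]
        for i in range(len(times)):
            for j in range(i):
                ratio = abs(values[i] - values[j]) / abs(times[i] - times[j])
                worst = max(worst, ratio)
    return worst

alphas = [0.5, 1.0, 5.0, 25.0, 125.0]
times = [0.05 * k for k in range(41)]  # grid on [0, 2]
print(worst_lipschitz_ratio(alphas, times) <= B + 1e-12)
```

The bound holds with the same $B$ however fast the integrands oscillate, which is what makes the family equicontinuous rather than just a collection of continuous functions.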

By suitably combining the criteria (i) and (ii), Eqs. (3.8.17) and (3.8.19), the fact that $(g_s^{-1})_{\ell\ell'}$ are $C^\infty$ functions on $\Xi^{-1}(S_\rho)$ and that $t \to b_\lambda(t)$ is an equicontinuous family [by (i) and by Eq. (3.8.19)], one realizes that the $\delta_{\lambda,\ell}$ form an equicontinuous equibounded family of functions on $[0,T]$ parameterized by $\lambda \ge 1$, $\ell = s+1,\ldots,Nd$.

Then the Ascoli-Arzelà criterion (see Appendix H) states that from every diverging sequence of positive numbers, it is possible to extract a diverging subsequence $(\lambda_n)_{n \in Z_+}$ such that the limit
$$\lim_{n\to\infty}\delta_{\lambda_n,\ell}(t) = \delta_\ell(t) \tag{3.8.29}$$
exists uniformly for $t \in [0,T]$, $\ell = s+1,\ldots,Nd$.

Equation (3.8.26) then implies (since it has already been observed that the terms in square brackets in the right-hand side vanish uniformly as $\lambda \to +\infty$, $\forall t \in [0,T]$) that
$$\lim_{n\to\infty}\dot b_{\lambda_n\ell}(t) = \delta_\ell(t). \tag{3.8.30}$$
Hence,
$$\lim_{n\to\infty} b_{\lambda_n\ell}(t) = b_\ell(0) + \int_0^t \delta_\ell(\tau)\,d\tau \overset{\mathrm{def}}{=} b_\ell(t), \qquad \ell = s+1,\ldots,Nd, \tag{3.8.31}$$
uniformly in $t \in [0,T]$, because the initial datum is $\lambda$ independent, and $b_\ell(t)$ is defined by the last identity. Of course, by changing the subsequence $\lambda_n$ we cannot yet be sure that $\delta_\ell$ and $b_\ell$, thus defined, do not change.

The functions $t \to b_\ell(t)$, $t \in [0,T]$, defined in Eq. (3.8.31) are, by Eq. (3.8.31) itself, once differentiable and
$$\dot b_\ell(t) = \delta_\ell(t), \qquad \forall t \in [0,T],\ \forall \ell = s+1,\ldots,Nd. \tag{3.8.32}$$

Coming back to Eq. (3.8.25) with $\lambda = \lambda_n$ and using Eqs. (3.8.17), (3.8.31), (3.8.30), and (3.8.32), we find that as $n \to \infty$, $\forall \ell = s+1,\ldots,Nd$:
$$\begin{aligned}
\sum_{\ell'=s+1}^{Nd} g_{\ell\ell'}(b(t))\,\dot b_{\ell'}(t) ={}& \sum_{\ell'=s+1}^{Nd} g_{\ell\ell'}(b(0))\,\dot b_{\ell'}(0) \\
&+ \int_0^t\Big\{-\sum_{i=1}^{N}\frac{\partial V^{(a)}}{\partial\xi^{(i)}}(\Xi(b(t')))\cdot\frac{\partial\Xi^{(i)}}{\partial\beta_\ell}(b(t')) \\
&\qquad + \frac12\sum_{\ell',\ell''=s+1}^{Nd}\frac{\partial g_{\ell'\ell''}}{\partial\beta_\ell}(b(t'))\,\dot b_{\ell'}(t')\,\dot b_{\ell''}(t')\Big\}\,dt',
\end{aligned}\tag{3.8.33}$$
having set $b(t) = (0,\ldots,0,b_{s+1}(t),\ldots,b_{Nd}(t))$ [recall that $b_\ell(t)$ is defined by Eq. (3.8.31) only for $\ell = s+1,\ldots,Nd$]. Therefore $t \to b(t)$ verifies the wanted initial conditions at $t = 0$, Eq. (3.7.24), as well as Eq. (3.7.18) and, by differentiating Eq. (3.8.33), also Eq. (3.7.20). It is also true that $b \in C^\infty([0,T])$. In fact, pretending that the right-hand side of Eq. (3.8.33) is known, we interpret Eq. (3.8.33) as a linear system in the unknowns $\dot b_\ell(t)$: its coefficients form the already-met nonsingular matrix $g_s(b(t))$. Proceeding as in Eq. (3.8.26), we find


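$$\begin{aligned}
\dot b_\ell(t) = \sum_{\ell'=s+1}^{Nd}\big(g_s(b(t))^{-1}\big)_{\ell\ell'}\Big(&\sum_{\ell''=s+1}^{Nd} g_{\ell'\ell''}(b(0))\,\dot b_{\ell''}(0) \\
&+ \int_0^t\Big\{-\sum_{i=1}^{N}\frac{\partial V^{(a)}}{\partial\xi^{(i)}}(\Xi(b(t')))\cdot\frac{\partial\Xi^{(i)}}{\partial\beta_{\ell'}}(b(t')) \\
&\qquad + \frac12\sum_{\ell'',\ell'''=s+1}^{Nd}\frac{\partial g_{\ell''\ell'''}}{\partial\beta_{\ell'}}(b(t'))\,\dot b_{\ell''}(t')\,\dot b_{\ell'''}(t')\Big\}\,dt'\Big),
\end{aligned}\tag{3.8.34}$$
and since the right-hand side is obviously once differentiable, it follows that $b_\ell$ is twice differentiable. Differentiating both sides, we find an expression for $\ddot b_\ell$ in terms of $b$, $\dot b$, and some integrals: hence, $\ddot b$ is differentiable, etc. So $b$ is in $C^\infty([0,T])$.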

It remains to show that the limit as $\lambda \to +\infty$ of $b_{\lambda\ell}(t)$ exists, $\forall \ell = s+1,\ldots,Nd$, without "passing to subsequences". It suffices to show that every divergent subsequence $\lambda_n \to +\infty$ for which the limit $\lim_{n\to+\infty} b_{\lambda_n\ell}(t)$, $\ell = s+1,\ldots,Nd$, exists uniformly has to converge to the same limit.

It is enough to show that there is only one function $t \to b(t)$ verifying Eq. (3.8.33), in $C^\infty([0,T])$ and such that $b(t) \in \Xi^{-1}(S_\rho)$, $\forall t \in [0,T]$, because every limit of a uniformly convergent subsequence has to verify Eq. (3.8.33). The following trick, which will be sublimated in §3.11 and §3.12, can be used.

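Set, $\forall \ell = s+1,\ldots,Nd$:
$$p_\ell(t) \overset{\mathrm{def}}{=} \sum_{\ell'=s+1}^{Nd} g_{\ell\ell'}(b(t))\,\dot b_{\ell'}(t) \tag{3.8.35}$$
or, for short,
$$p = g_s(b)\,\dot b, \tag{3.8.36}$$
where $g_s(\beta)$ is the last principal submatrix of $g(\beta)$ of order $Nd - s$. Then, by differentiation with respect to $t$, Eq. (3.8.33) yields the following equations, $\forall \ell = s+1,\ldots,Nd$:
$$\begin{aligned}
\dot p_\ell &= -\sum_{i=1}^{N}\frac{\partial V^{(a)}}{\partial\xi^{(i)}}(\Xi(b))\cdot\frac{\partial\Xi^{(i)}}{\partial\beta_\ell}(b) + \frac12\sum_{\ell',\ell''=s+1}^{Nd}\frac{\partial g_{\ell'\ell''}}{\partial\beta_\ell}(b)\,\big(g_s(b)^{-1}p\big)_{\ell'}\,\big(g_s(b)^{-1}p\big)_{\ell''}, \\
\dot b_\ell &= \big(g_s(b)^{-1}p\big)_\ell,
\end{aligned}\tag{3.8.37}$$
with the notations of Eq. (3.8.33), having dropped the $t$ dependence from $p(t)$, $b(t)$ and having deduced the second equation from Eq. (3.8.36). In Eq. (3.8.37), $b$ means $(0,\ldots,0,b_{s+1},\ldots,b_{Nd})$.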

Note that $p_\ell(0)$, $b_\ell(0)$, $\ell = s+1,\ldots,Nd$, are, by Eq. (3.8.36) or by assumption, independent of the sequence used to construct $b$.

So every sequence $\lambda_n \to +\infty$, for which $b_{\lambda_n\ell}(t)$ is uniformly convergent in $t \in [0,T]$ to a limit, $\forall \ell = s+1,\ldots,Nd$, can be used to construct a solution of the differential equation (3.8.37) for $t \to (p_\ell(t), b_\ell(t))_{\ell=s+1,\ldots,Nd}$ verifying the initial condition $(p_\ell(0), b_\ell(0))_{\ell=s+1,\ldots,Nd}$.

Eq. (3.8.37) is not quite a differential equation of the type considered in the uniqueness theorem, Proposition 1, §2.2, p.14, since the right-hand side of Eq. (3.8.37) is defined only for $b \in \Xi^{-1}(S_\rho)$ as a function of the $p_\ell$'s, $b_\ell$'s. However, all functions $b(t)$, $t \in [0,T]$, which can be built via the above construction, are such that $b(t) \in \Xi^{-1}(S_\rho)$, $t \in [0,T]$. Then it easily follows from Proposition 1, p.14, as a corollary, that every solution to Eq. (3.8.37) $t \to (p(t), b(t))$, $t \in [0,T]$, verifying $b(t) \in \Xi^{-1}(S_\rho)$, $\forall t \in [0,T]$, must be identical to every other with this property. mbe
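
The convergence statement just proved can be illustrated on the simplest example. The sketch below is only an illustration under stated assumptions, not part of the proof: a unit mass in a vertical plane with potential energy $Gz$, constrained to the unit circle, with the constraint replaced by the stiff potential $\lambda W$, $W = \frac12(r-1)^2$ (a hypothetical choice of $W$ with a quadratic minimum on the circle). The limit motion is the ideal pendulum; the integrator (classical Runge-Kutta), the step size and all names are illustrative choices.

```python
import math

G = 1.0  # gravity strength (illustrative units)

def rk4(f, y, dt, steps):
    """Classical fourth-order Runge-Kutta integration of y' = f(y)."""
    for _ in range(steps):
        k1 = f(y)
        k2 = f([a + 0.5 * dt * b for a, b in zip(y, k1)])
        k3 = f([a + 0.5 * dt * b for a, b in zip(y, k2)])
        k4 = f([a + dt * b for a, b in zip(y, k3)])
        y = [a + dt * (p + 2 * q + 2 * r + s) / 6.0
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y

def stiff_field(lam):
    """Unit mass in the plane, force -grad(G*z + lam*W), with W = (r - 1)^2 / 2."""
    def f(y):
        x, z, vx, vz = y
        r = math.hypot(x, z)
        pull = -lam * (r - 1.0) / r  # radial restoring force divided by r
        return [vx, vz, pull * x, pull * z - G]
    return f

def pendulum_field(y):
    """Ideal constraint to the unit circle: theta'' = -G sin(theta)."""
    theta, omega = y
    return [omega, -G * math.sin(theta)]

def deviation(lam, theta0=1.0, T=2.0, dt=1e-3):
    """Distance at time T between the stiff motion and the ideal pendulum,
    both started at rest on the circle at angle theta0 from the downward vertical."""
    steps = round(T / dt)
    start = [math.sin(theta0), -math.cos(theta0), 0.0, 0.0]
    x, z, _, _ = rk4(stiff_field(lam), start, dt, steps)
    theta, _ = rk4(pendulum_field, [theta0, 0.0], dt, steps)
    return math.hypot(x - math.sin(theta), z + math.cos(theta))

for lam in (1e2, 1e4):
    print(lam, deviation(lam))
```

With these choices the deviation at the final time shrinks as $\lambda$ grows, in agreement with the uniform convergence on $[0,T]$ stated by Proposition 13; the step size must be kept small compared with $\lambda^{-1/2}$, the period scale of the fast oscillations transversal to the constraint.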


3.8.1 Problems 


1. Let $\Sigma \subset R^{Nd}$ be a regular surface with codimension $s$. Let $(U,\Xi)$ be a regular system of local coordinates well adapted and orthogonal on $\Sigma$ with respect to the scalar product of Eq. (3.7.1). Denote $\beta \in \Omega$ the coordinates of $\xi = \Xi(\beta) \in U$. Set $\beta = (\beta_v,\beta_n) \in R^s \times R^{Nd-s}$. Show that the change of coordinates $(\beta_v,\beta_n) \to (A\beta_v, \Lambda\beta_n)$, with $A$ and $\Lambda$ being two $s \times s$ and $(Nd-s) \times (Nd-s)$ constant matrices, allows us to define a new system of local coordinates which is still well adapted and orthogonal on $\Sigma$.


2. In the context of Problem 1, let $W(\beta_1,\ldots,\beta_s)$ be a $C^\infty$ function of $(\beta_1,\ldots,\beta_s,\beta_{s+1},\ldots,\beta_{Nd})$ independent of the last $(Nd-s)$ coordinates. Suppose that $M_{ij} = \frac{\partial^2 W}{\partial\beta_i\,\partial\beta_j}(0)$, $i,j = 1,\ldots,s$, is an $s \times s$ matrix which is positive definite. Show that there is a change of coordinates $\beta \to \beta'$ of the linear type considered in Problem 1 changing $W$ into a function such that $M_{ij} = \delta_{ij}$, $i,j = 1,\ldots,s$. (Hint: Use $\Lambda = 1$, $A = J$ = {orthogonal matrix diagonalizing $M$} (see Appendix F, F.4). Then make a further change of coordinates of the same
= i AL 

type with A = 1 and A = {diagonal matrix with diagonal elements (w, ?,..., Ws 7) where 

(w1,..., Ws) are the s eigenvalues of M.) 


3. Show that there is essentially no change in the proof of Proposition 13 if Eq. (3.8.11) is changed into
$$W(\beta_1,\ldots,\beta_s) = \frac12\sum_{i=1}^{s}\beta_i^2 + o(\beta_1^2 + \ldots + \beta_s^2).$$


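4.* Alternatively to Problem 3, but with the same assumptions, show that there is a change of coordinates which, possibly restricting the size of $U$ to $U' \subset U$, changes $\beta$ into $\beta'$, retaining the orthogonality and good adaptation of the $\beta'$ coordinates, changing $W$ into $\frac12\sum_{i=1}^{s}\beta_i'^2$. (Hint: For $\beta \in \Omega$, $\Omega$ = (basis of $(U,\Xi)$), let $W(\beta) = \frac12\beta^2 + \sum_{i,j,\ell=1}^{s}\gamma_{ij\ell}(\beta)\,\beta_i\beta_j\beta_\ell$, where $\gamma_{ij\ell}$ are suitable $C^\infty(\Omega)$ functions symmetric in the indices $i,j,\ell$ (this assumption is not restrictive because of Problem 2 and of the Taylor-Lagrange theorem, Appendix B). Then define $\beta'_\ell = \beta_\ell + \sum_{j,k=1}^{s} f_{\ell jk}(\beta)\,\beta_j\beta_k$ with $f$ symmetric in $\ell,j,k$ and of class $C^\infty$ in $\beta$ and impose
$$\frac12\beta^2 + \sum_{i,j,\ell=1}^{s}\gamma_{ij\ell}(\beta)\,\beta_i\beta_j\beta_\ell = \frac12\beta'^2 = \frac12\beta^2 + \sum_{\ell,j,k=1}^{s}\beta_\ell\,f_{\ell jk}(\beta)\,\beta_j\beta_k + \frac12\sum_{\ell=1}^{s}\sum_{j,k,j',k'=1}^{s} f_{\ell jk}(\beta)\,f_{\ell j'k'}(\beta)\,\beta_j\beta_k\beta_{j'}\beta_{k'}.$$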

Therefore, yje has to be equal to 


BE, Bjr Bri Bii Bri, 


1 
fik elb) + 5 5 fes jaska (B) Fe, jtk, (B) AA 


(41514155481) D* (9k) 


, 


where $\sum^*$ means that the monomial in square brackets has to "simplify" so that all the terms in the denominator cancel with some in the numerator. Show that for small $\beta$, by the implicit functions theorem, the above relation allows us to determine $f$ in terms of $\gamma$ and $\beta$. Then, again by the implicit functions theorem, invert the relation between $\beta$ and $\beta'$ to complete the change of coordinates.)


5.* Let $t \to x(t)$, $t \in R_+$, be a motion of $N$ points, with masses $m_1,\ldots,m_N > 0$, which develops under the influence of an active force $F^{(a)}$, conservative with potential energy $V^{(a)} \in C^\infty(R^{Nd})$ bounded from below, and of an ideal constraint to a regular surface $\Sigma \subset R^{Nd}$ with a codimension $s$.

Let ĉo = x(0), no = x(0), and let 79 be any velocity vector such that y = no and 
call t + x(t) the motion with initial datum (tjo, o) of the same system moving under 
the influence of the same active force and of an approximate constraint model (X, W, A) 
verifying the assumptions of Proposition 13. Call $T_0$ = {supremum of the $T$'s such that $\lim_{\lambda\to+\infty} x_\lambda(t) = x(t)$ uniformly for $t \in [0,T]$}. Show that $T_0 = +\infty$. (Hint: The part of Proposition 13 proved in this section says that $T_0 > 0$. Suppose $T_0 < +\infty$. Let $\xi_0 = x(T_0)$, $\eta_0 = \dot x(T_0)$ and note that the energies of the motions $x_\lambda$ and $x$ are $\lambda$ independent and coincide. Discuss the system's motion in the coordinate system $(U,\Xi)$ around $\xi_0$ which is well adapted and orthogonal to $\Sigma$ and in which $W$ admits a representation like Eq. (3.8.11). Show that in $(U,\Xi)$, $x_\lambda$ verifies equations like Eqs. (3.8.24), (3.8.25), (3.8.26), and (3.8.31) with some slight changes which do not affect the conclusion that $\lim_{\lambda\to+\infty} x_\lambda(t) = x(t)$ as long as $x(t)$ stays inside $U$ at a positive distance from $\partial U$. Since the conservation of energy implies the bound of Eq. (3.8.14) on speed, it is clear that for large $\lambda$, $x_\lambda(t)$ and $x(t)$ will stay at a positive distance from $\partial U$ for all times in a neighborhood of $T_0$. Hence, $T_0 = +\infty$.)


6.* Show that a system of $N$ point masses, with masses $m_1,\ldots,m_N > 0$, bound by an ideal bilateral constraint to a regular surface $\Sigma \subset R^{Nd}$ and also subject to a conservative active force with potential energy $V^{(a)}$ bounded from below, has (for all $\xi_0 \in \Sigma$ and $\eta_0$ tangent to $\Sigma$) a unique global motion $t \to x(t)$, $t \in R_+$, such that $\dot x(0) = \eta_0$, $x(0) = \xi_0$; i.e., a motion verifying Eq. (3.7.20) in every local system of regular coordinates. (Hint: Use the energy conservation and the existence and uniqueness theorems for Eq. (3.7.20) following from its transformation into Eq. (3.8.37): energy conservation together with the semiboundedness of $V^{(a)}$ gives an a priori estimate.)


3.9 Application to Rigid Motion. König's Theorem


The general perfection criterion for approximate constraints discussed in §3.8 
is interesting because it establishes perfection of some classes of constraint 
models. 

In this section, as an application of the results of §3.8, Proposition 13, it 
will be shown that a natural rigidity constraint model is approximately ideal. 

Consider the following model $(\Sigma, W, A)$, which is one of the most important constraint models for $N$ points. Let $\ell_{ij} > 0$ be given numbers defined for $(i,j) \in S$ = {subset of the set of pairs of different points in $(1,\ldots,N)$}; let $\sigma_i$, $i \in T \subset \{1,\ldots,N\}$, be a family of regular surfaces in $R^d$; then define
$$\Sigma = \{\xi \mid \xi = (\xi^{(1)},\ldots,\xi^{(N)}) \in R^{dN},\ |\xi^{(i)} - \xi^{(j)}| = \ell_{ij} \text{ for } (i,j) \in S;\ \xi^{(i)} \in \sigma_i \text{ for } i \in T\}, \tag{3.9.1}$$
$$W(\xi) = \sum_{(i,j)\in S}\varphi_{ij}\big(|\xi^{(i)} - \xi^{(j)}|^2 - \ell_{ij}^2\big) + \sum_{i\in T}\psi_i\big(|\xi^{(i)} - \sigma_i|^2\big), \tag{3.9.2}$$
where $\varphi_{ij}, \psi_i \in C^\infty(R)$, $\varphi_{ij}(0) = \psi_i(0) = 0$, and have a strict minimum at zero; the notation $|\xi - \sigma_i|^2$ denotes a $C^\infty$ function on $R^d$ positive outside $\sigma_i$ and, near $\sigma_i$, equal to the square of the distance between $\xi$ and $\sigma_i$. Here $\sigma_i$ may also be a single point.

$(\Sigma, W, A)$ is a natural model of rigidity for some system points (those in $S$) and for permanence on a surface or on a point (if $\sigma_i$ is zero dimensional) for some of the system points (those in $T$).

In applications, it is quite common to meet only constraints for which the above is a good model, when friction is neglected. It is not completely trivial to show that Eqs. (3.9.1) and (3.9.2) are an approximate ideal model in the sense of Definition 13, §3.7.

Here we shall examine, for simplicity, only the case in which Eq. (3.9.1) is a "total rigidity" constraint, i.e., the case when $S$ contains so many pairs (e.g. all) as to allow only configurations which can be obtained by rigid motions of a single one or, at most, of finitely many. Nevertheless, we formulate the general result.


14 Proposition. The model $(\Sigma, W, A)$ defined by Eqs. (3.9.1) and (3.9.2) is a model of an approximate ideal constraint for a system of $N$ points with (arbitrary) masses $m_1,\ldots,m_N > 0$.


Proof. (Case $T = \emptyset$, $S$ such that $\Sigma$ is a total rigidity constraint). The surface $\Sigma$, in the case under examination, decomposes into a finite number of connected parts, each representing a rigid system in the usual sense of the word.$^{9}$


Fig. 3.2. Example of two rigid disconnected configurations.


Suppose $N \ge 3$, the $N = 2$ case having been already discussed in the Example in §3.8, p.187. Suppose also that the points 1, 2, and 3 are not aligned in the configurations of $\Sigma$: the degenerate case of $N$ aligned points could be treated likewise.

The configurations $\xi' \in \Sigma$ located on the same connected component of $\Sigma$ shall be uniquely determined by the position $G$ of the system baricenter in the "fixed" Cartesian reference frame $(O; i,j,k)$ and by three orthogonal unit vectors $(i_1,i_2,i_3)$ fixed with the system ("co-moving") and finally by the positions $P_1,\ldots,P_N$ of the $N$ points in the reference frame $(G; i_1,i_2,i_3)$. By the rigidity constraint, the points $P_1,\ldots,P_N$ will have coordinates $(P_i - G)_\ell$, $\ell = 1,2,3$, $i = 1,2,\ldots,N$, which are given constants, $\forall\,\xi'$ in the same connected component of $\Sigma$, in the frame $(G; i_1,i_2,i_3)$. Suppose to have fixed $i_3$ parallel to $(P_2 - P_1)$ and $i_2$ parallel to the plane $(P_1,P_2,P_3)$ (but orthogonal to $i_3$).

To prove Proposition 14, it will be sufficient to build a system of coordinates, local near $\xi_0 \in \Sigma$, regular, well adapted, and orthogonal to $\Sigma$ with respect to the scalar product of Eq. (3.7.1) and with the extra property that $W$, the constraint structure function, has the property of Eq. (3.8.1).


9 $\Sigma$ may consist of several connected parts: for instance, if $N = 4$ and the distances of the points 0,1,2,3 are $d(0,1) = 1$, $d(0,2) = 1$, $d(1,2) = \sqrt2$, $d(1,3) = \sqrt2$, $d(2,3) = \sqrt2$, respectively, then $\Sigma$ contains two connected parts. The first consists of the configurations obtained by rotations and translations of the configuration in Fig. 3.2(a) and the other of those in Fig. 3.2(b).
Without loss of generality, suppose that the plane of the first two axes in the co-moving frame $(G_0; i_1^{(0)}, i_2^{(0)}, i_3^{(0)})$, associated with $\xi_0 \in \Sigma$, i.e. the plane $(i_1^{(0)}, i_2^{(0)})$, is not parallel to the plane $(i,j)$.

Let $\xi$ be a configuration close to $\Sigma$: in general, $\xi \notin \Sigma$ and with $\xi$ are associated $3N$ coordinates, obtained through the following construction. $\xi$ can be determined by assigning a configuration $\xi' \in \Sigma$ and the vectors $\kappa^{(1)},\ldots,\kappa^{(N)}$ providing the deviations of the points in $\xi$ with respect to the corresponding points $P_1,\ldots,P_N$ in $\xi'$. The $(3N + 6)$ coordinates so introduced, i.e., the $3N$ components of $\kappa^{(1)},\ldots,\kappa^{(N)}$ in the frame $(G; i_1,i_2,i_3)$ fixed with $\xi'$ and the six coordinates giving the position and orientation in space of $(G; i_1,i_2,i_3)$, i.e., of $\xi'$, are redundant and six of them must be eliminated.

Coordinates that can be used to determine $(G; i_1,i_2,i_3)$ are the three Cartesian coordinates of $G$ in the fixed frame $(O; i,j,k)$ and the three "Euler angles" $(\theta,\varphi,\psi)$ of $(G; i_1,i_2,i_3)$, where $n$ is the unit vector along the intersection between the plane $(i,j)$ and the plane $(i_1,i_2)$, arbitrarily oriented (the "node line"). The angles $\theta = \widehat{i_3 k}$, $\varphi = \widehat{i\,n}$, $\psi = \widehat{n\,i_1}$ are illustrated in Fig. 3.3.

Fig. 3.3. The Euler angles.

The components in $(O; i_1,i_2,i_3)$ of $k$, $i_3$, $n$ are, respectively:
$$\begin{aligned}
k &= (\sin\theta\sin\psi,\ \sin\theta\cos\psi,\ \cos\theta),\\
i_3 &= (0,\ 0,\ 1),\\
n &= (\cos\psi,\ -\sin\psi,\ 0)
\end{aligned}\tag{3.9.3}$$
and will be useful in the following.

To obtain a local system of regular coordinates near $\xi_0$, remove from the $3N + 6$ redundant coordinates, just introduced, six among them by imposing the following six restrictions:
$$\sum_{i=1}^{N} m_i\,\kappa^{(i)} = 0, \tag{3.9.4}$$
$$\sum_{i=1}^{N} (P_i - G)\wedge m_i\,\kappa^{(i)} = 0, \tag{3.9.5}$$
which signify that $G$ is actually the baricenter of $\xi$ as well as that of $\xi'$ and that the configuration $\xi' \in \Sigma$ is so chosen that the system $(m_i\kappa^{(i)})_{i=1}^{N}$ of "quantities of deviation" from $\xi' \in \Sigma$ has a vanishing "angular momentum". The above restrictions should be thought of as restrictions on the choice of the reference configuration $\xi' \in \Sigma$, a priori arbitrary.

The six coordinates that can be eliminated via Eq. (3.9.4) and Eq. (3.9.5) are, for instance, the first two components of $\kappa^{(1)}$ in $(G; i_1,i_2,i_3)$, the three components of $\kappa^{(2)}$, and the first component of $\kappa^{(3)}$, still in $(G; i_1,i_2,i_3)$. The "free coordinates" $\beta = (\beta_1,\ldots,\beta_{3N})$ will then be (orderly enumerated):
$$(\kappa_3^{(1)}, \kappa_2^{(3)}, \kappa_3^{(3)}, \kappa_1^{(4)}, \kappa_2^{(4)}, \kappa_3^{(4)}, \ldots, \kappa_3^{(N)}, \theta, \varphi, \psi, (x_G)_1, (x_G)_2, (x_G)_3),$$
where $\theta,\varphi,\psi$ are the Euler angles of $(i_1,i_2,i_3)$ with respect to $(i,j,k)$, while $\kappa_\ell^{(i)}$ are the components in $(G; i_1,i_2,i_3)$ of the deviations $\kappa^{(i)}$.
Given the $3N$ coordinates $\beta$, the configuration $\xi = \Xi(\beta)$ is built as follows:

(i) $x_G = (\beta_{3N-2}, \beta_{3N-1}, \beta_{3N}) = \beta_G$ determines the baricenter $G$.

(ii) $\beta_{rot} = (\beta_{3N-5}, \beta_{3N-4}, \beta_{3N-3}) = (\theta,\varphi,\psi)$ determine the orientation of the axes $i_1,i_2,i_3$. Therefore, the positions $P_1,\ldots,P_N$ of the $N$ points of the auxiliary configuration, called $\xi'$ above, are determined.

(iii) The coordinates $\beta_v = (\beta_1,\ldots,\beta_{3N-6})$ determine $(\kappa^{(4)},\ldots,\kappa^{(N)})$ and, hence, the positions in $(G; i_1,i_2,i_3)$ of the points labeled $4,5,\ldots,N$ and, furthermore, the coordinates $\kappa_3^{(1)}$ and $\kappa_2^{(3)}, \kappa_3^{(3)}$ of $\kappa^{(1)}, \kappa^{(3)}$.

(iv) The coordinates of $\kappa^{(2)}$, as well as the remaining coordinates of $\kappa^{(1)}, \kappa^{(3)}$, are determined from Eqs. (3.9.4) and (3.9.5). Eq. (3.9.4) yields
$$m_2\,\kappa^{(2)} = -m_1\,\kappa^{(1)} - \sum_{i=3}^{N} m_i\,\kappa^{(i)} \tag{3.9.6}$$
which, inserted into Eq. (3.9.5), yields
$$m_1 (P_1 - P_2)\wedge\kappa^{(1)} + \sum_{i=3}^{N} m_i (P_i - P_2)\wedge\kappa^{(i)} = 0. \tag{3.9.7}$$
By scalar multiplication of Eq. (3.9.7) by $(P_1 - P_2)$, it is
$$\sum_{i=3}^{N} m_i (P_i - P_2)\wedge\kappa^{(i)}\cdot(P_1 - P_2) = 0, \tag{3.9.8}$$
which determines the value of $\kappa_1^{(3)}$. In fact, recalling that $i_1$ is orthogonal to the plane $(i_2,i_3)$, which contains the points 1, 2, 3, and that these three points are not aligned, $(P_3 - P_2)\wedge i_1\cdot(P_1 - P_2) \ne 0$, so that Eq. (3.9.8) is a linear equation for $\kappa_1^{(3)}$ (with non-zero coefficient in front of $\kappa_1^{(3)}$).

Once $\kappa^{(3)}$ is completely determined, Eq. (3.9.7) unambiguously provides the first two components of $\kappa^{(1)}$, because Eq. (3.9.7) only leaves the component of $\kappa^{(1)}$ along $(P_1 - P_2)$ undetermined, which, however, is just $\kappa_3^{(1)}$ (recall that $i_3$ is parallel, by construction, to $(P_1 - P_2)$), i.e., it is already known to be $\beta_1$.

Finally, once $\kappa^{(1)}$ and $\kappa^{(3)}$ are completely known, $\kappa^{(2)}$ is derived from Eq. (3.9.6).

It is now possible to check the invertibility, near $\xi_0$, of the transformation associating with $\beta = (\beta_v, \beta_{rot}, \beta_G)$ the configuration $\Xi(\beta) = \xi$ built following rules (i)-(iv). Such a transformation is also of class $C^\infty$ with nonsingular Jacobian matrix near $\xi_0$. However, we do not enter into the laborious analysis of the check of the regularity, invertibility, and non singularity of $\Xi$: it does not present any conceptual problem.

Hence, the transformation $\Xi$ establishes a regular system of local coordinates in some small enough neighborhood $U$ of $\xi_0 \in \Sigma$.

Clearly, the points in $\Sigma \cap U$ are those described by $\beta_1 = \ldots = \beta_{3N-6} = 0$; i.e., $(U,\Xi)$ is adapted to $\Sigma$. Actually, $(U,\Xi)$ is well adapted and orthogonal on $\Sigma$, with respect to the scalar product of Eq. (3.7.1). To show this, try to find the kinetic matrix associated with $(U,\Xi)$ and Eq. (3.7.1). For this purpose the kinetic energy of a motion $t \to x(t)$ of $N$ points has to be expressed in terms of the motion $t \to b(t) = (b_v(t), b_{rot}(t), b_G(t)) \overset{\mathrm{def}}{=} \Xi^{-1}(x(t))$, assuming that the motion $x$ takes place inside $U$ for $t \in [t_1,t_2]$.

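By the definition of the coordinates $\beta$, one has, for $t \in [t_1,t_2]$:
$$x^{(i)}(t) = b_G(t) + \sum_{\ell=1}^{3}\big(\kappa_\ell^{(i)}(t) + (P_i - G)_\ell\big)\,i_\ell(t) \tag{3.9.9}$$
and by differentiation one finds
$$\dot x^{(i)}(t) = \dot b_G(t) + \sum_{\ell=1}^{3}\Big(\dot\kappa_\ell^{(i)}(t)\,i_\ell(t) + \big(\kappa_\ell^{(i)}(t) + (P_i - G)_\ell\big)\frac{d\,i_\ell(t)}{dt}\Big). \tag{3.9.10}$$
We will now use a kinematic formula giving a simple expression to the time derivative of three mutually orthogonal unit vectors which are time dependent:
$$\frac{d\,i_\ell(t)}{dt} = \omega\wedge i_\ell(t), \qquad \ell = 1,2,3; \tag{3.9.11}$$
here $\omega = \omega(t)$ is a suitable vector called "angular velocity" of the triple $(i_1,i_2,i_3)$.$^{10}$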

10 To understand Eq. (3.9.11), note that, in general, the space orientation of three mutually orthogonal axes $i_1,i_2,i_3$ imagined as emerging from a fixed point $\Omega$ can only vary if its three Euler angles $(\theta,\varphi,\psi)$ with respect to a fixed triple $(i,j,k)$ change. If only $\theta$ varies, it means that the reference frame $(\Omega; i_1,i_2,i_3)$ rotates around the node line (see Fig. 3.3), and it is then clear that every point $P$ co-moving with $(\Omega; i_1,i_2,i_3)$ has velocity $v_P = \dot\theta\,n\wedge(P - \Omega)$. This holds, in particular, for the extremities of $(i_1,i_2,i_3)$; hence:
A useful expression for $\omega$ is, in terms of the Euler angles (see p.202, footnote 10):
$$\omega = \dot\theta\,n + \dot\varphi\,k + \dot\psi\,i_3. \tag{3.9.12}$$
Coming back to Eq. (3.9.10), we shall rewrite it by using Eq. (3.9.11):
$$\dot x^{(i)}(t) = \dot b_G + \dot{\tilde\kappa}^{(i)} + \omega\wedge\big(\kappa^{(i)} + (P_i - G)\big), \tag{3.9.13}$$
$$\frac{d\,i_\ell}{dt} = \dot\theta\,n\wedge i_\ell, \qquad \ell = 1,2,3.$$
A similar argument shows that if $i_1,i_2,i_3$ move because only $\varphi$ or $\psi$ vary, then
$$\frac{d\,i_\ell}{dt} = \dot\varphi\,k\wedge i_\ell, \qquad\text{or}\qquad \frac{d\,i_\ell}{dt} = \dot\psi\,i_3\wedge i_\ell.$$
More generally, if $i_1,i_2,i_3$ vary because $\theta,\varphi,\psi$ simultaneously vary, it will be (by the differentiation rule of composed functions):
$$\frac{d\,i_\ell}{dt} = \omega\wedge i_\ell, \qquad \ell = 1,2,3,$$
with $\omega$ given by $\dot\theta\,n + \dot\varphi\,k + \dot\psi\,i_3$, i.e., Eq. (3.9.11).

In connection with Eq. (3.9.11), it is natural to note one of its consequences: the relation between a motion $t \to P(t)$ in a frame $(O; i,j,k)$ and the same motion in a time-dependent frame $(\Omega(t); i_1(t), i_2(t), i_3(t))$. From the vector relation
$$P(t) - O = (P(t) - \Omega(t)) + (\Omega(t) - O)$$
written componentwise as
$$x(t)\,i + y(t)\,j + z(t)\,k = x_1(t)\,i_1(t) + x_2(t)\,i_2(t) + x_3(t)\,i_3(t) + x_\Omega(t)\,i + y_\Omega(t)\,j + z_\Omega(t)\,k,$$
with obvious notations, it follows, by differentiation, that
$$V^{(a)} = V^{(r)} + x_1\frac{d\,i_1}{dt} + x_2\frac{d\,i_2}{dt} + x_3\frac{d\,i_3}{dt} + V_\Omega,$$
where $V^{(a)} = \dot x(t)\,i + \dot y(t)\,j + \dot z(t)\,k$ is the velocity of $t \to P(t)$ in $(O; i,j,k)$, $V^{(r)}$ is the velocity of the same motion "relative" to $(\Omega(t); i_1(t), i_2(t), i_3(t))$, i.e., $V^{(r)} = \dot x_1(t)\,i_1(t) + \dot x_2(t)\,i_2(t) + \dot x_3(t)\,i_3(t)$, and $V_\Omega$ is the velocity of the motion $t \to \Omega(t)$ in $(O; i,j,k)$, i.e., $V_\Omega = \dot x_\Omega(t)\,i + \dot y_\Omega(t)\,j + \dot z_\Omega(t)\,k$. Then, by using Eq. (3.9.11):
$$V^{(a)} = V^{(r)} + \omega\wedge\big(x_1(t)\,i_1(t) + x_2(t)\,i_2(t) + x_3(t)\,i_3(t)\big) + V_\Omega = V^{(r)} + \big(\omega\wedge(P - \Omega) + V_\Omega\big).$$
The term in parentheses has the interpretation of the "drag velocity" that the point $P$ would have if it were fixed in $(\Omega; i_1,i_2,i_3)$; hence, the above formula reads "the absolute speed equals the sum of the relative speed plus the drag speed". Furthermore, the velocity of a point $P$ fixed in a moving frame $(\Omega; i_1,i_2,i_3)$ is given by
$$V_P = V_\Omega + \omega\wedge(P - \Omega),$$
where $V_P$ is the velocity in $(O; i,j,k)$ of $P$, $\omega$ is the angular velocity of the triplet $(i_1,i_2,i_3)$ in $(O; i,j,k)$, and $V_\Omega$ is the velocity of $\Omega$ in $(O; i,j,k)$. The last relation is of great interest in the theory of rigid motion.
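where we set $\dot{\tilde\kappa}^{(i)} = \sum_{\ell=1}^{3}\dot\kappa_\ell^{(i)}\,i_\ell(t)$ (which differs from $\dot\kappa^{(i)}(t)$; in fact, it is the velocity of the $i$-th point relative to the moving frame, while $\dot\kappa^{(i)}$ is its velocity relative to the fixed frame), and $(P_i - G) = \sum_{\ell=1}^{3}(P_i - G)_\ell\,i_\ell(t)$.

It is now possible to compute the kinetic energy, using Eq. (3.9.13):
$$\begin{aligned}
\frac12\sum_{i=1}^{N} m_i(\dot x^{(i)})^2 &= \frac12\sum_{i=1}^{N} m_i\Big(\dot b_G + \omega\wedge\big(\kappa^{(i)} + (P_i - G)\big) + \dot{\tilde\kappa}^{(i)}\Big)^2 \\
&= \frac12\sum_{i=1}^{N} m_i\,\dot b_G^2 + \frac12\sum_{i=1}^{N} m_i\Big(\omega\wedge\big(\kappa^{(i)} + (P_i - G)\big)\Big)^2 \\
&\quad + \frac12\sum_{i=1}^{N} m_i\,(\dot{\tilde\kappa}^{(i)})^2 + \sum_{i=1}^{N} m_i\,\dot b_G\cdot\omega\wedge\big(\kappa^{(i)} + (P_i - G)\big) \\
&\quad + \sum_{i=1}^{N} m_i\,\dot b_G\cdot\dot{\tilde\kappa}^{(i)} + \sum_{i=1}^{N} m_i\,\omega\wedge\big(\kappa^{(i)} + (P_i - G)\big)\cdot\dot{\tilde\kappa}^{(i)}.
\end{aligned}\tag{3.9.14}$$
The fourth and fifth terms in the right-hand side vanish identically: this follows by taking the constant vectors out of the summations and recalling the definition of the baricenter (by which $\sum_{i=1}^{N} m_i(P_i - G) = 0$) as well as Eq. (3.9.4). To study the second and the sixth terms of the right-hand side, we will use the formula
$$(a\wedge b)\cdot c = (b\wedge c)\cdot a = (c\wedge a)\cdot b \tag{3.9.15}$$
to note that
$$\sum_{i=1}^{N} m_i\,\omega\wedge\big(\kappa^{(i)} + (P_i - G)\big)\cdot\dot{\tilde\kappa}^{(i)} = \omega\cdot\Big(\sum_{i=1}^{N} m_i\big(\kappa^{(i)} + (P_i - G)\big)\wedge\dot{\tilde\kappa}^{(i)}\Big), \tag{3.9.16}$$
and one can remark that the quantity within brackets in the right-hand side of Eq. (3.9.16) is the angular momentum $K_G^{(in)}$ "relative" to the frame $(G; i_1,i_2,i_3)$ (also called the "internal angular momentum") and, furthermore, by Eq. (3.9.5) written componentwise in $(G; i_1,i_2,i_3)$, by the $\dot{\tilde\kappa}^{(i)}$ definition and by the time independence of the components of $(P_i - G)$ in $(G; i_1,i_2,i_3)$, it follows that $\sum_{i=1}^{N} m_i(P_i - G)\wedge\dot{\tilde\kappa}^{(i)} = 0$,$^{11}$ so that
$$K_G^{(in)} = \sum_{i=1}^{N} m_i\big(\kappa^{(i)} + (P_i - G)\big)\wedge\dot{\tilde\kappa}^{(i)} = \sum_{i=1}^{N} m_i\,\kappa^{(i)}\wedge\dot{\tilde\kappa}^{(i)}; \tag{3.9.17}$$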
11 Let s = 1,2,3 then Eq. (3.9.5) gives 
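it is therefore true and, as it will be seen, important that $K_G^{(in)} = 0$ if $\kappa^{(1)} = \ldots = \kappa^{(N)} = 0$, i.e., if the system is, at the time considered, on $\Sigma$. The second term of the right-hand side of Eq. (3.9.14) will be written as
$$\begin{aligned}
\frac12\sum_{i=1}^{N} m_i\Big(\omega\wedge\big(\kappa^{(i)} + (P_i - G)\big)\Big)^2 &= \frac12\,\omega\cdot\Big[\sum_{i=1}^{N} m_i\big(\kappa^{(i)} + (P_i - G)\big)\wedge\Big(\omega\wedge\big(\kappa^{(i)} + (P_i - G)\big)\Big)\Big] \\
&= \frac12\,\omega\cdot I(\omega),
\end{aligned}\tag{3.9.18}$$
where Eq. (3.9.15) has been used, having defined
$$I(\omega) \overset{\mathrm{def}}{=} \sum_{i=1}^{N} m_i\big(\kappa^{(i)} + (P_i - G)\big)\wedge\Big(\omega\wedge\big(\kappa^{(i)} + (P_i - G)\big)\Big). \tag{3.9.19}$$
Then define
$$T_G \overset{\mathrm{def}}{=} \frac12\Big(\sum_{i=1}^{N} m_i\Big)\dot x_G^2, \qquad\text{"baricenter kinetic energy"}, \tag{3.9.20}$$
$$T^{(in)} \overset{\mathrm{def}}{=} \frac12\sum_{i=1}^{N} m_i\,(\dot{\tilde\kappa}^{(i)})^2, \qquad\text{"internal kinetic energy"}, \tag{3.9.21}$$
$$T_C \overset{\mathrm{def}}{=} K_G^{(in)}\cdot\omega, \qquad\text{"complementary" or "Coriolis" kinetic energy}, \tag{3.9.22}$$
$$T_{rot} \overset{\mathrm{def}}{=} \frac12\,\omega\cdot I(\omega), \qquad\text{"rotational kinetic energy"}, \tag{3.9.23}$$
and remark that it has just been shown that
$$T = T_G + T^{(in)} + T_{rot} + T_C. \tag{3.9.24}$$

When $\omega = 0$, this relation is called "König's theorem".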
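
The decomposition can be checked numerically on arbitrary data. The sketch below is a plain verification under stated assumptions: the positions `r` play the role of $\kappa^{(i)} + (P_i - G)$ and the relative velocities `u` that of the velocities relative to the co-moving frame, both forced to have vanishing mass-weighted sum (as required by the baricenter definition and by Eq. (3.9.4)); the numerical values and the function names are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add(*vectors):
    return tuple(sum(c) for c in zip(*vectors))

def centered(masses, vectors):
    """Subtract the mass-weighted mean, so that sum_i m_i v_i = 0."""
    total = sum(masses)
    mean = tuple(sum(m * v[k] for m, v in zip(masses, vectors)) / total
                 for k in range(3))
    return [tuple(v[k] - mean[k] for k in range(3)) for v in vectors]

def energy_split(masses, r, u, v_G, omega):
    """Return (T, T_G, T_in, T_rot, T_C) for point velocities v_G + u_i + omega ^ r_i."""
    velocities = [add(v_G, ui, cross(omega, ri)) for ri, ui in zip(r, u)]
    T = 0.5 * sum(m * dot(v, v) for m, v in zip(masses, velocities))
    T_G = 0.5 * sum(masses) * dot(v_G, v_G)
    T_in = 0.5 * sum(m * dot(ui, ui) for m, ui in zip(masses, u))
    T_rot = 0.5 * sum(m * dot(cross(omega, ri), cross(omega, ri))
                      for m, ri in zip(masses, r))
    # internal angular momentum: sum_i m_i r_i ^ u_i
    K_in = add(*[tuple(m * c for c in cross(ri, ui))
                 for m, ri, ui in zip(masses, r, u)])
    T_C = dot(K_in, omega)
    return T, T_G, T_in, T_rot, T_C

masses = [1.0, 2.0, 3.0, 1.5]
r = centered(masses, [(1.0, 0.0, 0.2), (0.0, 1.0, -0.3), (-0.5, 0.4, 1.0), (0.3, -0.7, 0.1)])
u = centered(masses, [(0.1, -0.2, 0.0), (0.0, 0.3, 0.1), (-0.2, 0.1, 0.4), (0.5, 0.0, -0.1)])
T, T_G, T_in, T_rot, T_C = energy_split(masses, r, u, (0.3, -0.2, 0.5), (0.7, 0.1, -0.4))
print(abs(T - (T_G + T_in + T_rot + T_C)) < 1e-12)
```

Calling `energy_split` with `omega = (0.0, 0.0, 0.0)` makes the last two terms vanish and leaves the two-term splitting into baricenter and internal kinetic energy.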
From Eq. (3.9.24), it can be seen that the coordinate system defined by $\Xi$ near $\xi_0$ is well adapted and orthogonal on $\Sigma$. In fact, one can note that at a

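$$\begin{aligned}
0 &= \frac{d}{dt}\Big(\sum_{i=1}^{N} m_i(P_i - G)\wedge\kappa^{(i)}\Big)_s = \frac{d}{dt}\sum_{i=1}^{N} m_i\sum_{\ell,\ell'=1}^{3}(P_i - G)_\ell\,\kappa^{(i)}_{\ell'}\,(i_\ell\wedge i_{\ell'})_s \\
&= \sum_{i=1}^{N} m_i\sum_{\ell,\ell'=1}^{3}(P_i - G)_\ell\,\dot\kappa^{(i)}_{\ell'}\,(i_\ell\wedge i_{\ell'})_s = \Big(\sum_{i=1}^{N} m_i(P_i - G)\wedge\dot{\tilde\kappa}^{(i)}\Big)_s
\end{aligned}$$
since $(i_\ell\wedge i_{\ell'})_s$ is either 0 or $\pm1$, $\forall t$, i.e. it has 0 $t$-derivative.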
point $\xi \in \Sigma$, the coordinates $\kappa^{(1)},\ldots,\kappa^{(N)}$, hence, the $\beta_v$'s, vanish (i.e., the $\beta$ coordinates are adapted to $\Sigma$). Furthermore, if the motion $t \to x(t)$ happens to occupy the position $\xi \in \Sigma$ at a certain time $t_0$ we have $T_C(t_0) = 0$, by Eqs. (3.9.17) and (3.9.22).

At the same time instant $T^{(in)}(t_0)$, as one realizes from the determination of $\kappa^{(1)}, \kappa^{(2)}, \kappa^{(3)}$ via Eqs. (3.9.4) and (3.9.5), is a quadratic form in $\dot b_v(t_0)$ with coefficients only depending upon the structure of $\Sigma$ via the coordinates $(P_i - G)_\ell$, $\ell = 1,2,3$, $i = 1,\ldots,N$, which are given constants ($\xi$ independent, hence, $\beta_v$ independent).

Finally, $T_G(t_0)$ is a quadratic form in $\dot b_G(t_0)$ with constant coefficients, while $T_{rot}(t_0)$ is a quadratic form in the components of $\omega$ in $(G; i_1,i_2,i_3)$ with coefficients depending only on the structure of $\Sigma$ via the (constant) coordinates $(P_i - G)_\ell$, $\ell = 1,2,3$, $i = 1,\ldots,N$ [see, for more details, Eq. (3.9.29)]. Hence, since the components of $\omega$ in $(G; i_1,i_2,i_3)$ are, by Eq. (3.9.3),
$$\begin{aligned}
\omega_1 &= \dot\theta\cos\psi + \dot\varphi\sin\theta\sin\psi,\\
\omega_2 &= -\dot\theta\sin\psi + \dot\varphi\sin\theta\cos\psi,\\
\omega_3 &= \dot\varphi\cos\theta + \dot\psi,
\end{aligned}\tag{3.9.25}$$
it follows that the rotation kinetic energy is a quadratic form in $\dot b_{rot}(t_0) = (\dot\theta,\dot\varphi,\dot\psi)_{t=t_0}$ with coefficients solely dependent on $\theta,\varphi,\psi$ [by Eq. (3.9.25)].
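
The components of $\omega$ in the co-moving frame can be cross-checked against a direct computation. The sketch below assumes a specific realization of the Euler angles, namely that the matrix whose columns are $i_1,i_2,i_3$ in the fixed frame is $R = R_z(\varphi)R_x(\theta)R_z(\psi)$ (this convention is an assumption chosen to agree with the components of $k$, $i_3$, $n$ listed in Eq. (3.9.3); it is not stated in this form in the text). It compares the closed formulas with the vector read off from the antisymmetric matrix $R^T\dot R$, where $\dot R$ is approximated by a central difference. All function names are illustrative.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def frame(theta, phi, psi):
    """Columns are i1, i2, i3 in the fixed frame (assumed z-x-z convention)."""
    return matmul(rot_z(phi), matmul(rot_x(theta), rot_z(psi)))

def omega_formula(theta, psi, dtheta, dphi, dpsi):
    """Components of the angular velocity in the co-moving frame, closed form."""
    return (dtheta * math.cos(psi) + dphi * math.sin(theta) * math.sin(psi),
            -dtheta * math.sin(psi) + dphi * math.sin(theta) * math.cos(psi),
            dphi * math.cos(theta) + dpsi)

def omega_numeric(angles, rates, h=1e-6):
    """Same components from R^T dR/dt, with dR/dt by central differences."""
    theta, phi, psi = angles
    dtheta, dphi, dpsi = rates
    Rp = frame(theta + h * dtheta, phi + h * dphi, psi + h * dpsi)
    Rm = frame(theta - h * dtheta, phi - h * dphi, psi - h * dpsi)
    dR = [[(Rp[i][j] - Rm[i][j]) / (2.0 * h) for j in range(3)] for i in range(3)]
    W = matmul(transpose(frame(theta, phi, psi)), dR)  # antisymmetric matrix
    return (W[2][1], W[0][2], W[1][0])

angles, rates = (0.7, 0.3, 1.1), (0.2, -0.5, 0.9)
closed = omega_formula(angles[0], angles[2], *rates)
numeric = omega_numeric(angles, rates)
print(max(abs(a - b) for a, b in zip(closed, numeric)) < 1e-7)
```

The formulas are linear in the angular rates with coefficients depending only on $\theta$ and $\psi$, which is why the rotational kinetic energy is a quadratic form in the rates.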

Hence, the quadratic forms defining $T$ on $\Sigma$ do not contain any mixed terms like $(\dot b_v)_i(\dot b_{rot})_j$ or $(\dot b_v)_i(\dot b_G)_j$; therefore, the coordinate system is orthogonal on $\Sigma$ (see Definition 12, §3.7, p.177). It is also well adapted by the above observed constancy of the coefficients of the quadratic form in $\dot b_v(t_0)$ expressing $T^{(in)}(t_0)$.

From the definition of $W$, it appears that $W$ depends only upon $\kappa^{(1)},\ldots,\kappa^{(N)}$ through their components in $(G; i_1,i_2,i_3)$, i.e., only upon $\beta_v$ [in fact, as already remarked, such components can be reconstructed from the $\beta_v$'s via Eqs. (3.9.4) and (3.9.5) and depend only on the $\beta_v$'s and do not depend on $(\beta_{rot},\beta_G)$].

This concludes the perfection proof for the constraint model $(\Sigma, W, A)$ in the rigid case considered above. mbe


Observation. By deducing Eq. (3.9.24), it has been explicitly shown that the
kinetic energy of a rigid body, i.e., of a motion of a system of $N$ masses in
$\mathbb{R}^3$ constrained to keep fixed mutual distances, can be expressed in terms of
six coordinates and their derivatives. If such coordinates are the baricenter
coordinates $x_G$ and the three Euler angles $(\theta, \varphi, \psi)$ of a co-moving frame
$(G; i_1, i_2, i_3)$ with respect to a fixed frame $(O; i, j, k)$ and if $\omega_1, \omega_2, \omega_3$ are the
components in $(G; i_1, i_2, i_3)$ of the angular velocity [see Eqs. (3.9.12), (3.9.3),
and (3.9.25)], then there exists a $3 \times 3$ matrix $I = (I_{ij})_{i,j=1,2,3}$ such that


$$
T = \frac{1}{2} M \dot x_G^2 + \frac{1}{2} \sum_{i,j=1}^{3} I_{ij}\, \omega_i\, \omega_j
\tag{3.9.26}
$$

where $M = \sum_i m_i$ and $T$ denotes the system kinetic energy. In fact, Eq.
(3.9.26) follows from Eq. (3.9.24), since in this case $\kappa^{(i)} = 0$ (because the
motion is rigid) and $T^{(V)} = 0 = T_c$, and from Eq. (3.9.18) showing

$$
T_{rot} = \frac{1}{2} \sum_{i=1}^{N} m_i \big[(P_i - G) \wedge (\omega \wedge (P_i - G))\big] \cdot \omega
= \frac{1}{2} \sum_{i=1}^{N} m_i \big(\omega \wedge (P_i - G)\big)^2
\tag{3.9.27}
$$

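by Eq. (3.9.15); then Eq. (3.9.27) permits us to obtain Eq. (3.9.26) as follows.
If $\theta_i$ is the angle between $\omega$ and $(P_i - G)$:

$$
\begin{aligned}
(\omega \wedge (P_i - G))^2 &= \omega^2 (P_i - G)^2 (\sin\theta_i)^2 = \omega^2 (P_i - G)^2 \big(1 - (\cos\theta_i)^2\big) \\
&= \omega^2 (P_i - G)^2 \Big[1 - \frac{(\omega \cdot (P_i - G))^2}{\omega^2 (P_i - G)^2}\Big] = \omega^2 (P_i - G)^2 - (\omega \cdot (P_i - G))^2 \\
&= \Big(\sum_{\ell=1}^{3} \omega_\ell^2\Big) \Big(\sum_{\ell'=1}^{3} (P_i - G)_{\ell'}^2\Big) - \sum_{\ell,\ell'=1}^{3} \omega_\ell\, \omega_{\ell'}\, (P_i - G)_\ell\, (P_i - G)_{\ell'} \\
&= \sum_{\ell,\ell'=1}^{3} \omega_\ell\, \omega_{\ell'} \big((P_i - G)^2\, \delta_{\ell\ell'} - (P_i - G)_\ell\, (P_i - G)_{\ell'}\big)
\end{aligned}
\tag{3.9.28}
$$

hence,

$$
I_{\ell\ell'} = \sum_{i=1}^{N} m_i \Big\{ \Big(\sum_{j=1}^{3} (P_i - G)_j^2\Big) \delta_{\ell\ell'} - (P_i - G)_\ell\, (P_i - G)_{\ell'} \Big\},
\tag{3.9.29}
$$

which are constants, $\forall\, \ell, \ell' = 1,2,3$, characteristic of the rigid body because
such are the components $(P_i - G)_\ell$, $i = 1,\dots,N$, $\ell = 1,2,3$, of the vectors
$(P_i - G)$ in the co-moving frame $(G; i_1, i_2, i_3)$. We shall come back to Eqs.
(3.9.26) and (3.9.29), deducing them independently of the constraint theory,
to help the readers who have not paid attention to the proofs of this section.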
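
The inertia matrix of Eq. (3.9.29) and the two expressions of the rotational kinetic energy can be compared numerically. The following is only a minimal sketch: the masses, the positions $P_i - G$ (chosen so that the baricenter is at the origin of the co-moving frame) and the angular velocity are arbitrary illustrative values, and the function names are hypothetical.

```python
# Sketch: inertia matrix of point masses, Eq. (3.9.29), and a numerical check
# that (1/2) sum I[l][lp] w_l w_lp equals (1/2) sum m_i (w ^ r_i)^2, Eq. (3.9.27).
# All numerical data below are illustrative assumptions.

def inertia_matrix(masses, positions):
    """Return I[l][lp] = sum_i m_i (|r_i|^2 delta_{l,lp} - r_i[l] r_i[lp]),
    where r_i = P_i - G is given by its components in the co-moving frame."""
    inertia = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for l in range(3):
            for lp in range(3):
                delta = 1.0 if l == lp else 0.0
                inertia[l][lp] += m * (r2 * delta - r[l] * r[lp])
    return inertia

def cross(a, b):
    """Vector product a ^ b in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def t_rot_direct(masses, positions, omega):
    """(1/2) sum_i m_i (omega ^ r_i)^2."""
    return 0.5 * sum(m * sum(c * c for c in cross(omega, r))
                     for m, r in zip(masses, positions))

def t_rot_from_matrix(inertia, omega):
    """(1/2) sum_{l,lp} I[l][lp] omega_l omega_lp."""
    return 0.5 * sum(inertia[l][lp] * omega[l] * omega[lp]
                     for l in range(3) for lp in range(3))

masses = [1.0, 2.0, 3.0]
# sum_i m_i r_i = 0, i.e. the origin is the baricenter G
positions = [(3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (-1.0, -2.0, 0.0)]
omega = (0.3, -0.2, 0.5)

inertia = inertia_matrix(masses, positions)
print(t_rot_direct(masses, positions, omega), t_rot_from_matrix(inertia, omega))
```

The two printed numbers should agree up to rounding errors, which is the content of the chain of identities (3.9.28).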


3.9.1 Exercises and Problems 


1. Suppose that the reference system (G; i1, i2,i3), with origin at the baricenter of a system 
of masses, has a purely translational motion in the reference system (0;i,j,k). Show that 
the kinetic energy is T = Tg + TË»). 


2. Let $t \to \omega(t)$, $t \in \mathbb{R}_+$, be the angular velocity for the triplet of orthogonal unit vectors
$i_1, i_2, i_3$ moving in the reference frame $(O; i, j, k)$. Let $t \to y(t)$, $t \in \mathbb{R}_+$, be a $\mathbb{R}^3$-valued
function and write it as $y(t) = \sum_{\ell=1}^{3} y_\ell(t)\, i_\ell(t)$. Define $\mathring{y}(t) = \sum_{\ell=1}^{3} \dot y_\ell(t)\, i_\ell(t)$ and show
that $\dot y = \mathring{y} + \omega \wedge y$ by using $\frac{d i_\ell}{dt} = \omega \wedge i_\ell$. Show also that $\dot\omega = \mathring{\omega}$.


3. Compute the components of $\omega$, Eq. (3.9.12), in $(O; i, j, k)$ in terms of the Euler angles
and their derivatives.


4. Evaluate the matrix $I_{\ell\ell'}$ for the rigid system in Fig. 3.2(a), assuming $m_0 = m_1 = m_2 =
m_3 = 1$ or $m_0 = 1$, $m_1 = 2$, $m_2 = 3$, $m_3 = 4$, taking for the moving frame the one with
axes parallel to those in Fig. 3.2(a) and origin in $G$. (Hint: If the direct computation looks
cumbersome, replace $G$ by $O$ using Problems 5 and 6 below.)


5. Consider a rigid system constrained to have one of its points fixed at the origin of the fixed
frame of reference $(O; i, j, k)$. Show that if $(O; i_1, i_2, i_3)$ is a co-moving frame, the kinetic
energy can be expressed as $T = \frac{1}{2} \sum_{\ell,\ell'=1}^{3} J_{\ell\ell'}\, \omega_\ell\, \omega_{\ell'}$, where $J_{\ell\ell'}$ are constants depending only
on the body structure.


6. In the context of Problem 5, consider the cases when $O = G$ and when $O \ne G$, calling $I_{\ell\ell'}$
the matrix expressing the kinetic energy in the frame $(G; i_1, i_2, i_3)$ and $J_{\ell\ell'}$ the one
expressing it in the frame $(O; i_1, i_2, i_3)$. Show that if $M = \sum_{i=1}^{N} m_i$,

$$
J_{\ell\ell'} = I_{\ell\ell'} + M \big[(G - O)^2\, \delta_{\ell\ell'} - (G - O)_\ell\, (G - O)_{\ell'}\big].
$$


3.10 General Considerations on the Theory of 
Constraints 


The approximate constraint theory, in the analysis of Proposition 13, §3.8, 
and Proposition 14, §3.9, still contains some unsatisfactory aspects that it is 
useful to mention explicitly. 

In applications in which a certain model $(X, W, \lambda)$ of approximately ideal
constraint is a good model, the rigidity parameter $\lambda$ has a well-defined value
$\lambda < +\infty$ which is fixed and, therefore, cannot tend to $+\infty$.

Therefore, the problem arises of how to estimate, in terms of $\lambda$, the error
encountered when approximating the “real motions” $t \to x_\lambda(t)$ with their
limits as $\lambda \to +\infty$ (which are described by the equations of motion relative to
ideal constraints, because $(X, W, \lambda)$ is supposed to be an approximately ideal
constraint, i.e., by “simple” equations).

The theory of §3.8, if one carefully looks at the formulas derived in the 
proof, also provides some estimates of the errors made in the mentioned ap- 
proximation. 

However, it is sufficient to simply look at the calculations made there to 
realize that if N is a number of the order of magnitude of a few dozens (not 
to speak of the cases when it is on the order of Avogadro’s number, as is 
sometimes the case), such estimates become ridiculously rough for reasonable 
values of $\lambda$ and reasonable models of $W$.

As usual, the problem of finding good error estimates is a problem that
should not be posed in too great a generality but should be discussed in
connection with precise and concrete questions of a physical nature concerning
the behavior of physical entities which, in each case, appear as interesting.
Even so, it remains a very difficult question and is a typical problem in
statistical mechanics. Except in a few simple cases, it is an essentially open problem
from a mathematical viewpoint.

Physicists and engineers have elaborated theories, mathematically non rigorous
consequences of the dynamics of point masses, which allow them to
evaluate the errors involved in the perfect constraint approximation in a
reasonable way, often experimentally correct.¹²

However, it is often only through recourse to experiments that one is able 
to understand whether a certain constraint can or cannot be approximated 
by an ideal one. 

It is good for the student to keep the above considerations in mind while 
solving the standard book-made problems concerning the constrained motions 
in order to appreciate their often purely didactic and abstract nature. 

The above discussion, which we will not continue, gives an idea of the depth 
of the ideal constraint notion, and it can perhaps be useful to understand why 
long and learned discussions on the argument often take place. So many and 
so diverse are these arguments that they may leave those who realize their 
existence for the first time quite surprised. 

Other problems naturally arise in the theory of holonomic constraints.
Some of them are:

(i) When an approximate constraint $(X, W, \lambda)$ is not perfect, how can the
motion be described in the limit $\lambda \to +\infty$? Is it possible, as the considerations
in §3.6 seem to suggest, to treat the constraint in this limit as ideal in the
sense of §3.5, modifying the potential energy of the active forces, possibly as
a function of the initial datum? See the example of §3.6, following Definition
10.

(ii) In case (i), how can we find the active forces? And how can we estimate
the errors involved in the approximation $\lambda = +\infty$?

(iii) How can we treat the case when the constraint model $(X, W, \lambda)$ is
ideal but the system moves under the influence of a force which is the sum of
the constraint force, with potential energy $\lambda W$, and a force law, in $C^\infty(\mathbb{R}^{Nd})$,

$$
F^{(j)} = F^{(j)}(\xi^{(1)}, \dots, \xi^{(N)}), \qquad j = 1, \dots, N
\tag{3.10.1}
$$

which is not necessarily conservative?

(iv) This is the same as (iii), replacing the constraint model by an approx- 
imate conservative model which is not approximately ideal. 

(v) This is the same as (i) and (ii) in the situation described by (iii). 


12 For instance, elasticity theory has, among other theories, this scope. Of course, elasticity 
theory can be set up as a mathematically rigorous theory in itself: what is non rigorous 
is the connection between elasticity theory and the above microscopic theory of con- 
straints. In other words, elasticity theory is itself a mathematical model which in this 
case “models” another mathematical model: even such things can happen! 
The preceding problems are not easy and are open problems to some extent
(in the sense that there do not seem to be in the literature any interesting
general propositions about them) except problem (iii) which is essentially
completely solved by the following proposition, proved exactly in the same
way as the analogous Proposition 13, §3.8, p.186:


15 Proposition. Let $(X, W, \lambda)$ be a model for an ideal approximate bilateral
$s$-codimensional constraint for $N$ points in $\mathbb{R}^d$, with masses $m_1, \dots, m_N > 0$.
Consider an initial datum $(\eta_0, \xi_0) \in \mathbb{R}^{2Nd}$ such that $\xi_0 \in X$. Let $t \to x_\lambda(t)$
be the motion that follows this initial datum and develops under the influence
of the field of conservative forces with potential energy $\lambda W$ and of a field
$F \in C^\infty(\mathbb{R}^{Nd})$ of uniformly bounded forces, not necessarily conservative.
Then the limit

$$
\lim_{\lambda \to +\infty} x_\lambda(t) = x(t)
\tag{3.10.2}
$$

exists for every $t \in \mathbb{R}$ and it is a motion constrained to $X$ with initial datum

$$
x(0) = \xi_0, \qquad \dot x(0) = \eta_0^{\parallel}
\tag{3.10.3}
$$

[see Eqs. (3.7.24) and (3.7.26)]. Suppose that for $t \in [t_1, t_2]$ the motion $x$
dwells in a neighborhood $U$ where a system $(U, \Xi)$ of local regular coordinates
adapted to $X$ is established. Then $x$ is described in the basis $\Omega$ for $(U, \Xi)$ by
a motion $t \to b(t)$ verifying:

$$
b_1(t) = b_2(t) = \dots = b_s(t) = 0,
\tag{3.10.4}
$$

$$
\frac{d}{dt} \frac{\partial T}{\partial \alpha_i}(\dot b(t), b(t)) - \frac{\partial T}{\partial \beta_i}(\dot b(t), b(t)) = \Phi_i(b(t)),
\tag{3.10.5}
$$

$\forall\, i = s+1, \dots, Nd$, where

$$
T(\alpha, \beta) = \frac{1}{2} \sum_{i,j=1}^{Nd} g_{ij}(\beta)\, \alpha_i\, \alpha_j, \qquad (\alpha, \beta) \in \mathbb{R}^{Nd} \times \Omega
\tag{3.10.6}
$$

if $g$ is the kinetic matrix associated with the system $(U, \Xi)$ and

$$
\Phi_i(\beta) = \sum_{k=1}^{N} F^{(k)}(\Xi(\beta)) \cdot \frac{\partial \Xi^{(k)}}{\partial \beta_i}(\beta).
\tag{3.10.7}
$$


Observation. The functions in Eq. (3.10.7) on $\Omega$ are called the “force components”
of the force $F = (F^{(1)}, \dots, F^{(N)})$ in the reference system
$(U, \Xi)$. The proof of Proposition 15 is a repetition of that of Proposition 13.


A final comment on the theorems of §3.8 and §3.10 concludes this section.
The condition $\xi_0 \in X$ appears to be somewhat unnatural, and one would like
to change it to “$\xi_0$ close enough to $X$”. However, the problem is what is meant
by “close enough”?

It is quite clear that the closeness notion should be $\lambda$ dependent: in fact,
we shall call $\xi_0$ close to $X$ only if the energy $\lambda W(\xi_0)$ is not too large; i.e., if
the initial deviation out of $X$ does not involve “too large constraint forces” or
“too large elastic deformation energy”.

It is then clear that it will be possible to try to prove propositions analogous
to Proposition 13 or Proposition 15 by replacing the hypothesis $\xi_0 \in X$
with the hypothesis that the position of the initial datum is a function
of $\lambda$, $\xi_0(\lambda)$, such that the limit $\lim_{\lambda \to +\infty} \xi_0(\lambda) = \xi_0 \in X$. In this case,
$\lambda W(\xi_0(\lambda))/\lambda \to 0$ as $\lambda \to +\infty$ (i.e., the initial “constraint deformation energy” is not
too large, being of lower order with respect to $\lambda$, which is the order of the
energy of a $\lambda$-independent deformation).

The proof of the analogues of Propositions 13 and 15 would be identical 
under these more general assumptions: this could be realized via a detailed 
examination of their proofs. 


Historical Note: The idea that the constrained systems, ideal or not, could be 
thought of as limiting cases of non constrained systems subject to strong forces 
is naturally ancient. However, to the best of this author’s knowledge, it has 
been written down in the form of a precise theorem to be interpreted as a proof 
of the least-action principle in [1] (p.80-82). Here the idea is expressed and it 
is shown how the least-action principle can be deduced through Proposition 
13, §3.8, p.186. This is, in my opinion, the most interesting and deepest of 
the “proofs” of the least-action principle (and, hence, of the virtual-work 
principle). There exist other proofs, sometimes very ingenious, which, however, 
are never more than pseudo-proofs in the sense well described by E. Mach 
([31], e.g., in Chapter III, §5.6).


3.11 Equations of Hamilton and Lagrange. Analytical 
Mechanics 


Before beginning the study of concrete mechanical problems, it is convenient 
to deduce from what has already been seen some abstract mathematical struc- 
ture naturally arising in the context of constraint theory and the least-action 
principle. 

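14 Definition. Let $U \subset \mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$ be an open set and let $\mathcal{L} \in C^\infty(U)$ be
a real-valued function. $\mathcal{L}$ will be called a “regular Lagrangian function” on $U$
if the map $\Xi$ transforming the point $(\alpha, q, t) \in U$ into the point

$$
(\pi, q, t) \in \mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}
\tag{3.11.1}
$$

with

$$
\pi_i = \frac{\partial \mathcal{L}}{\partial \alpha_i}(\alpha, q, t), \qquad i = 1, \dots, \ell
\tag{3.11.2}
$$

maps the neighborhood $U$ into a neighborhood $V \subset \mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$, $V = \Xi(U)$,
invertibly and with a non vanishing Jacobian (“nonsingularly”).¹³


If $\mathcal{L}$ is a regular Lagrangian on $U$, the equations for the motion $t \to
(\dot q(t), q(t), t)$ in $U$,

$$
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \alpha_i}(\dot q(t), q(t), t) = \frac{\partial \mathcal{L}}{\partial q_i}(\dot q(t), q(t), t), \qquad i = 1, \dots, \ell
\tag{3.11.3}
$$

are called the “Lagrangian differential equations” for the Lagrangian $\mathcal{L}$.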
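Since the map (3.11.1) does not really involve $q$, the above definition makes
sense without change if $U$ is an open subset of $\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or of $\mathbb{R}^\ell \times (\mathbb{T}^{\ell_1} \times
\mathbb{R}^{\ell_2}) \times \mathbb{R}$, with $\ell_1 + \ell_2 = \ell$, provided $C^\infty(U)$ is understood in the natural sense
following Definition 18, p.101, §2.21.

In these cases, $V$ will have to be a subset of $\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or of $\mathbb{R}^\ell \times (\mathbb{T}^{\ell_1} \times
\mathbb{R}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$, and the points on the tori are to be thought of as
described in “angular coordinates”, see Definition 12, p.100, §2.21.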


Observations: 
(1) The usefulness of the clumsy-looking extension appearing in the second
part of the definition can be understood by noting that, for instance, a point,
with mass $m > 0$, bound to a vertically placed circle with radius $R$ by an ideal
constraint and subject to gravity has, if $\varphi$ is the natural angular coordinate on
the circle thought of as $\mathbb{T}^1$, a Lagrangian description in the sense of Definition
14 in terms of $\mathcal{L}(\alpha, \varphi, t) = \frac{1}{2} m R^2 \alpha^2 + m g R \cos\varphi$. In this case, $U = \mathbb{R} \times \mathbb{T}^1 \times
\mathbb{R}$ and Eq. (3.11.3) becomes the pendulum equation ($g$ being gravitational
acceleration).

Similarly, a free particle ideally bound to a circle will be described on
$\mathbb{R} \times \mathbb{T}^1 \times \mathbb{R}$ by $\mathcal{L}_0(\alpha, \varphi, t) = \frac{1}{2} m R^2 \alpha^2$.

Hence, when the surface $X$ generated by an ideal constraint is topologically
a torus, we have the possibility of using “global angular coordinates” without
having to cover $X$, to describe the motions on $X$, with several local systems
of regular coordinates.
(2) When $\mathcal{L}$ does not depend explicitly on time, i.e., $\mathcal{L}(\alpha, \beta, t) = L(\alpha, \beta)$,
$\forall (\alpha, \beta, t) \in U$, for some $L$, we say that $\mathcal{L}$ is “time independent” and we shall
write it without the variable $t$.


The following proposition holds. 


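16 Proposition. Let $\mathcal{L}$ be a regular Lagrangian on an open subset $U \subset
\mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$ (or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$, $\ell_2 > 0$), and let $t \to
(\dot q(t), q(t), t) \in U$ be a motion defined for $t \in [t_1, t_2]$, verifying Eq. (3.11.3).
Setting

$$
(\alpha(p, q, t), q, t) = \Xi^{-1}(p, q, t),
\tag{3.11.4}
$$

$$
H(p, q, t) = \sum_{i=1}^{\ell} p_i\, \alpha_i(p, q, t) - \mathcal{L}(\alpha(p, q, t), q, t),
\tag{3.11.5}
$$

for $(p, q, t) \in V$ (see Definition 14), the motion in $V$, image of the preceding
motion in $U$ via Eq. (3.11.1), $t \to (p(t), q(t), t) = \Xi(\dot q(t), q(t), t)$, verifies the
equations:

$$
\dot p_i = -\frac{\partial H}{\partial q_i}(p(t), q(t), t), \qquad i = 1, \dots, \ell
\tag{3.11.6}
$$

$$
\dot q_i = \frac{\partial H}{\partial p_i}(p(t), q(t), t), \qquad i = 1, \dots, \ell
\tag{3.11.7}
$$


13 The Jacobian determinant coincides with the determinant of the matrix $J_{ij} =
\frac{\partial^2 \mathcal{L}}{\partial \alpha_i \partial \alpha_j}(\alpha, q, t)$, $i, j = 1, \dots, \ell$.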
Observation. Note that Eqs. (3.11.6) and (3.11.7) are equations to which the 
local existence, uniqueness, and regularity theorems for differential equations 
can be immediately applied; this is not the case for Eq. (3.11.3), where the 
highest derivatives do not necessarily appear with constant coefficients: see 
also the final part of the proof of §3.8 p.196, to realize that this is really an 
inconvenience. 


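PROOF. We only discuss the case $U \subset \mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$, leaving the other two
cases ($U \subset \mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or $U \subset \mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$) as
exercises. In any case, the proof is just an algebraic check. Equation (3.11.3)
can be written by Eq. (3.11.2) as

$$
\dot p_i(t) = \frac{\partial \mathcal{L}}{\partial q_i}(\dot q(t), q(t), t), \qquad i = 1, \dots, \ell
\tag{3.11.8}
$$

but, by Eq. (3.11.5), $\forall\, i = 1, \dots, \ell$,

$$
\frac{\partial H}{\partial q_i} = -\frac{\partial \mathcal{L}}{\partial q_i} - \sum_{j=1}^{\ell} \frac{\partial \mathcal{L}}{\partial \alpha_j} \frac{\partial \alpha_j}{\partial q_i} + \sum_{j=1}^{\ell} p_j \frac{\partial \alpha_j}{\partial q_i},
\tag{3.11.9}
$$

and by Eqs. (3.11.2) and (3.11.4), implying $p_j = \frac{\partial \mathcal{L}}{\partial \alpha_j}(\alpha(p, q, t), q, t)$, the
two sums cancel and Eqs. (3.11.8) and (3.11.9) become Eq. (3.11.6).
Furthermore, by Eqs. (3.11.5) and (3.11.2),

$$
\frac{\partial H}{\partial p_i} = \alpha_i + \sum_{j=1}^{\ell} p_j \frac{\partial \alpha_j}{\partial p_i} - \sum_{j=1}^{\ell} \frac{\partial \mathcal{L}}{\partial \alpha_j} \frac{\partial \alpha_j}{\partial p_i} = \alpha_i(p, q, t),
\tag{3.11.10}
$$

i.e., Eq. (3.11.7) follows. mbe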


The above proposition suggests a definition. 
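15 Definition. Let $V$ be an open set in $\mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$ (or $\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or
$\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$, $\ell_2 > 0$) and let $H$ be a real-valued $C^\infty(V)$
function. $H$ will be said to be a “regular Hamiltonian function” on $V$ if the
map $\Psi$ transforming the point $(\pi, \beta, t) \in V$ into the point

$$
\Psi(\pi, \beta, t) = (\alpha, \beta, t)
\tag{3.11.11}
$$

with

$$
\alpha_i = \frac{\partial H}{\partial \pi_i}(\pi, \beta, t), \qquad i = 1, \dots, \ell
\tag{3.11.12}
$$

maps $V$ in a neighborhood $U \subset \mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$ (or $\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times
\mathbb{R}$, $\ell_1 + \ell_2 = \ell$, $\ell_2 > 0$, respectively), $U = \Psi(V)$, invertibly and nonsingularly.¹⁵

If $H$ is a regular Hamiltonian on $V$, the equations for the motion $t \to
(p(t), q(t), t)$ in $V$:

$$
\dot p_i(t) = -\frac{\partial H}{\partial q_i}(p(t), q(t), t),
\tag{3.11.13}
$$

$$
\dot q_i(t) = \frac{\partial H}{\partial p_i}(p(t), q(t), t), \qquad i = 1, \dots, \ell
\tag{3.11.14}
$$

are called the “Hamiltonian differential equations” for the Hamiltonian $H$.
A proposition similar to Proposition 16 holds.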


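17 Proposition. Let $H$ be a regular Hamiltonian function on $V \subset \mathbb{R}^\ell \times \mathbb{R}^\ell \times
\mathbb{R}$ (or $\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$, $\ell_2 > 0$). Let
$t \to (p(t), q(t), t)$ be a motion in $V$ defined for $t \in [t_1, t_2]$ and verifying Eqs.
(3.11.13) and (3.11.14). Setting

$$
(\pi(\alpha, \beta, t), \beta, t) = \Psi^{-1}(\alpha, \beta, t)
\tag{3.11.15}
$$

for $(\alpha, \beta, t) \in U$ (see Definition 15) and

$$
\mathcal{L}(\alpha, \beta, t) = \sum_{j=1}^{\ell} \pi_j(\alpha, \beta, t)\, \alpha_j - H(\pi(\alpha, \beta, t), \beta, t)
\tag{3.11.16}
$$

the motion in $U$, $t \to (\dot q(t), q(t), t) = \Psi(p(t), q(t), t)$, $t \in [t_1, t_2]$, verifies
the equations:

$$
p_i(t) = \frac{\partial \mathcal{L}}{\partial \alpha_i}(\dot q(t), q(t), t),
\tag{3.11.17}
$$

$$
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \alpha_i}(\dot q(t), q(t), t) = \frac{\partial \mathcal{L}}{\partial \beta_i}(\dot q(t), q(t), t), \qquad i = 1, \dots, \ell
\tag{3.11.18}
$$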

14 As in Definition 14, this definition makes sense without change if $V$ is an open subset of
$\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$ (see Definition 14 and Observation
(1) to Definition 14).

15 i.e., with non vanishing Jacobian determinant. Such a Jacobian determinant is easily
seen to be the determinant of the matrix $J_{ij} = \partial^2 H / \partial \pi_i \partial \pi_j$, $i, j = 1, \dots, \ell$.


PROOF. The proof is basically identical to that of Proposition 16.


Observations. 

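(1) Propositions 16 and 17 show that “a system of Lagrangian equations,
regular on $U$, is equivalent to a system of Hamiltonian equations, regular on
$V$, and vice versa”. The sets $U$ and $V$ are related by the relations:

(i) $V$ is the image of $U$ via the map

$$
\Xi: (\alpha, \beta, t) \to (\pi, \beta, t) = \Big(\frac{\partial \mathcal{L}}{\partial \alpha}(\alpha, \beta, t), \beta, t\Big)
\tag{3.11.19}
$$

where $\mathcal{L}$ is the Lagrangian function on $U$.
(ii) $U$ is the image of $V$ via the map

$$
\Psi: (\pi, \beta, t) \to (\alpha, \beta, t) = \Big(\frac{\partial H}{\partial \pi}(\pi, \beta, t), \beta, t\Big)
\tag{3.11.20}
$$

where $H$ is the Hamiltonian function on $V$ corresponding to $\mathcal{L}$.
(iii) $\mathcal{L}$ and the corresponding $H$ are related by

$$
H(\pi, \beta, t) = \sum_{i=1}^{\ell} \pi_i\, \alpha_i(\pi, \beta, t) - \mathcal{L}(\alpha(\pi, \beta, t), \beta, t),
\tag{3.11.21}
$$

$$
\mathcal{L}(\alpha, \beta, t) = \sum_{i=1}^{\ell} \pi_i(\alpha, \beta, t)\, \alpha_i - H(\pi(\alpha, \beta, t), \beta, t).
\tag{3.11.22}
$$

(2) Consider, in particular, Lagrangians of the form

$$
\mathcal{L}(\alpha, \beta, t) = \frac{1}{2} \sum_{i,j=1}^{\ell} g_{ij}(\beta)\, \alpha_i\, \alpha_j - V(\beta)
\tag{3.11.23}
$$

where $g$ is a positive definite matrix. $\mathcal{L}$ is usually defined in neighborhoods
$U = \mathbb{R}^\ell \times U_0 \times \mathbb{R}$, $U_0 \subset \mathbb{R}^\ell$ (or $U_0 \subset \mathbb{T}^\ell$ or $U_0 \subset \mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}$, $\ell_1 + \ell_2 = \ell$, $\ell_2 > 0$) open. Eq.
(3.11.23) is regular on $U$ because Eq. (3.11.19) [or Eq. (3.11.2)] becomes

$$
\pi_i = \sum_{j=1}^{\ell} g_{ij}(\beta)\, \alpha_j, \qquad i = 1, \dots, \ell,
\tag{3.11.24}
$$

which is invertible and nonsingular if thought of as defining [see Eq. (3.11.19)]
a map of $U$ onto $V = \mathbb{R}^\ell \times U_0 \times \mathbb{R}$: this is so by virtue of Proposition 11, §3.7,
p.182, on the kinetic matrices (implying $\det g(\beta) \ne 0$).

The Hamiltonian function associated with Eq. (3.11.23) is, by Eqs. (3.11.24)
and (3.11.21),

$$
H(\pi, \beta, t) = \frac{1}{2} \sum_{i,j=1}^{\ell} \big(g(\beta)^{-1}\big)_{ij}\, \pi_i\, \pi_j + V(\beta),
\tag{3.11.25}
$$

where $g(\beta)^{-1}$ is the inverse matrix to $g(\beta)$.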
(3) In the case of the Lagrangian (3.11.23), Eq. (3.11.2) [i.e., Eq. (3.11.24)] is
simply the condition expressing that the gradient of the function of $\alpha \in \mathbb{R}^\ell$,

$$
\alpha \to \pi \cdot \alpha - \mathcal{L}(\alpha, \beta, t)
\tag{3.11.26}
$$

vanishes. One can check that for such a value of $\alpha$, Eq. (3.11.26) actually
reaches its only absolute maximum [Note that in the case considered here,
Eq. (3.11.26) is a quadratic form in $\alpha$ plus a linear form in $\alpha$.] So

$$
H(\pi, \beta, t) = \max_{\alpha \in \mathbb{R}^\ell} \big(\pi \cdot \alpha - \mathcal{L}(\alpha, \beta, t)\big)
\tag{3.11.27}
$$

when $\mathcal{L}$ is given by Eq. (3.11.23) or, more generally, whenever the function
of Eq. (3.11.26) has only one stationarity point in $\alpha$ which is a maximum
(exercise). Similarly,

$$
\mathcal{L}(\alpha, \beta, t) = \max_{\pi \in \mathbb{R}^\ell} \big(\pi \cdot \alpha - H(\pi, \beta, t)\big)
\tag{3.11.28}
$$

if $H$ is given by Eq. (3.11.25) or, more generally, whenever the function of $\pi$
inside the parenthesis on the right-hand side has only one stationarity point
in $\pi$ which is a maximum.
Equations (3.11.27) and (3.11.28) are often called “Legendre’s duality” or
“Legendre’s transformations” on $\mathcal{L}$ or $H$, respectively.
(4) Definitions 14 and 15 and Propositions 16 and 17 assume a simpler
form if one is interested in Lagrangian or Hamiltonian functions not
explicitly depending on time and defined on sets $U$ or $V$ of the form $\tilde U \times J$
or $\tilde V \times J$ with $J$ = {open interval in $\mathbb{R}$} and $\tilde U, \tilde V \subset \mathbb{R}^{2\ell}$ or $\mathbb{R}^\ell \times \mathbb{T}^\ell$ or
$\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2})$, $\ell_1 + \ell_2 = \ell$, open sets.
In such cases, the $t$ parameter can be eliminated from the definition of the
sets $U, V$ (replacing them by $\tilde U, \tilde V$) and of the maps $\Xi, \Psi$ in Definitions 14
and 15, and $\mathcal{L}$ or $H$ will be functions in $C^\infty(\tilde U)$ or in $C^\infty(\tilde V)$. $\mathcal{L}$ or $H$ will
be called “time-independent” Lagrangian or Hamiltonian functions and they
generate autonomous Lagrangian or Hamiltonian equations via Eqs. (3.11.3),
(3.11.6), and (3.11.7).
When $\tilde V = \mathbb{R}^\ell \times U_0$, the space $\tilde V$ is usually called the “phase space” if it
is regarded as the initial data space for some time-independent Hamiltonian
equations: this name is often used even when $\tilde V$ is just an open set (not
necessarily of the form $\mathbb{R}^\ell \times U_0$). Similarly, when $\tilde U = \mathbb{R}^\ell \times U_0$, the space
$\tilde U$ is called the “data space” if it is regarded as the initial data space for a
time-independent Lagrangian equation.

The formal wording of the above concepts is straightforward and will be 
left to the reader. We shall freely refer to time-independent Lagrangian or 
Hamiltonian functions and equations on the data space or the phase space. 
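
Legendre’s duality, Eqs. (3.11.27) and (3.11.28), can be checked numerically in the simplest case of one degree of freedom. The sketch below is only illustrative: it assumes a constant 1 × 1 kinetic matrix `G`, an arbitrary potential `cos(beta)`, and a plain ternary search as the maximization method; none of these choices comes from the text.

```python
import math

G = 2.0  # assumed constant (positive) 1 x 1 kinetic matrix g

def potential(beta):
    """Arbitrary smooth potential V(beta), chosen only for illustration."""
    return math.cos(beta)

def lagrangian(alpha, beta):
    """L(alpha, beta) = (1/2) g alpha^2 - V(beta), the form of Eq. (3.11.23)."""
    return 0.5 * G * alpha * alpha - potential(beta)

def hamiltonian_closed_form(pi, beta):
    """H(pi, beta) = (1/2) g^{-1} pi^2 + V(beta), the form of Eq. (3.11.25)."""
    return pi * pi / (2.0 * G) + potential(beta)

def maximize_concave(f, lo=-100.0, hi=100.0, iterations=200):
    """Maximum value of a concave function on [lo, hi] by ternary search."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

def hamiltonian_by_max(pi, beta):
    """Right-hand side of Eq. (3.11.27): max over alpha of pi*alpha - L."""
    return maximize_concave(lambda a: pi * a - lagrangian(a, beta))

def lagrangian_by_max(alpha, beta):
    """Right-hand side of Eq. (3.11.28): max over pi of pi*alpha - H."""
    return maximize_concave(lambda p: p * alpha - hamiltonian_closed_form(p, beta))

print(hamiltonian_by_max(1.5, 0.3), hamiltonian_closed_form(1.5, 0.3))
print(lagrangian_by_max(0.7, 0.3), lagrangian(0.7, 0.3))
```

Each printed pair should coincide up to the accuracy of the search, since the maximized functions are quadratic in the maximization variable with a single stationarity point.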

It is interesting to note the following abstract version of the energy con- 
servation theorem. 




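18 Proposition. Consider a system of Hamiltonian equations in a neighborhood
$U = \mathbb{R}^\ell \times U_0 \times \mathbb{R}$, $U_0 \subset \mathbb{R}^\ell$ (or $U_0 \subset \mathbb{T}^\ell$ or $U_0 \subset \mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}$, $\ell_1 + \ell_2 = \ell$),
and let $(\pi, \beta, t) \to H(\pi, \beta, t)$ be the (regular) Hamiltonian function. If
$t \to (p(t), q(t), t) \in U$, $t \in [t_1, t_2]$, is a motion verifying in $U$ the Hamiltonian
equations, then

$$
\frac{d}{dt} H(p(t), q(t), t) = \frac{\partial H}{\partial t}(p(t), q(t), t).
\tag{3.11.29}
$$

Hence, if $H$ is time independent, i.e., $H(\pi, \beta, t) = h(\pi, \beta)$ for some $h \in
C^\infty(\mathbb{R}^\ell \times U_0)$, Eq. (3.11.29) implies the existence of a constant $E$, depending
on the motion under investigation, such that

$$
h(p(t), q(t)) = E, \qquad t \in [t_1, t_2].
\tag{3.11.30}
$$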


Observations. 

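(1) In the cases met so far, the Lagrange function had the form of Eq. (3.11.23)
and $\frac{1}{2} \sum_{i,j=1}^{\ell} g_{ij}(q(t))\, \dot q_i(t)\, \dot q_j(t)$ had the interpretation of kinetic energy $T(t)$
of the motion, while $V(q(t))$ had the interpretation of potential energy $V(t)$.
Furthermore, the relation between $p(t)$ and $\dot q(t)$ was [Eq. (3.11.24)]:

$$
p(t) = g(q(t))\, \dot q(t).
\tag{3.11.31}
$$

Then, by Eq. (3.11.31),

$$
\frac{1}{2} \sum_{i,j=1}^{\ell} \big(g(q(t))^{-1}\big)_{ij}\, p_i(t)\, p_j(t) = \frac{1}{2} \sum_{i,j=1}^{\ell} g_{ij}(q(t))\, \dot q_i(t)\, \dot q_j(t) = T(t).
\tag{3.11.32}
$$

Hence Eq. (3.11.30) becomes

$$
T(t) + V(t) = E.
\tag{3.11.33}
$$

(2) When a system of $N$ points without constraints, with the Lagrangian
function

$$
\mathcal{L}(\alpha, \beta) = \frac{1}{2} \sum_{i=1}^{N} m_i \big(\alpha^{(i)}\big)^2 - V(\beta)
\tag{3.11.34}
$$

is considered, we see that $\pi = (\pi^{(1)}, \dots, \pi^{(N)})$ with $\pi^{(i)} = m_i \alpha^{(i)}$, $i =
1, \dots, N$, so that if $t \to x(t)$, $t \in [t_1, t_2]$, is a system motion:

$$
p^{(i)}(t) = m_i\, \dot x^{(i)}(t), \qquad i = 1, \dots, N
\tag{3.11.35}
$$

which explains the name “generalized momenta” given to the variables $\pi_i$ in
general.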
The variables $\pi_i$ are also called the “conjugated momenta” with respect to
$\beta_i$, $i = 1, \dots, \ell$, and the $2\ell$ variables $(\pi, \beta)$ are called “canonical” variables in
the phase space of a Hamiltonian equation.

The word conjugation is used here because of the obvious symmetric role 
played by the p and q variables in the Hamiltonian equations. This symmetry 
could be used to build even more abstract structures associated with the 
theory of mechanical equations of motion for conservative systems; however, 
this aim will not be pursued here. 


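PROOF. In fact,

$$
\begin{aligned}
\frac{d}{dt} H(p(t), q(t), t) &= \frac{\partial H}{\partial t}(p(t), q(t), t) + \sum_{i=1}^{\ell} \Big(\frac{\partial H}{\partial p_i} \dot p_i + \frac{\partial H}{\partial q_i} \dot q_i\Big) \\
&= \frac{\partial H}{\partial t}(p(t), q(t), t) + \sum_{i=1}^{\ell} (\dot q_i \dot p_i - \dot p_i \dot q_i) = \frac{\partial H}{\partial t}(p(t), q(t), t)
\end{aligned}
\tag{3.11.36}
$$

mbe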
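
The conservation law (3.11.30) of Proposition 18 can be observed numerically. The following sketch integrates Hamilton’s equations for a pendulum with a classical fourth-order Runge-Kutta scheme and records the largest deviation of the energy from its initial value. The Hamiltonian $h(p, q) = p^2/(2 m R^2) - m g R \cos q$, the parameter values, the step size and the integrator are all assumptions made for illustration; the small residual drift is a discretization error, not a property of the exact motion.

```python
import math

M, R, GRAV = 1.0, 1.0, 9.81  # assumed pendulum parameters

def hamiltonian(p, q):
    """h(p, q) = p^2 / (2 m R^2) - m g R cos q (time independent)."""
    return p * p / (2.0 * M * R * R) - M * GRAV * R * math.cos(q)

def vector_field(p, q):
    """Hamilton's equations: (dp/dt, dq/dt) = (-dh/dq, dh/dp)."""
    return (-M * GRAV * R * math.sin(q), p / (M * R * R))

def rk4_step(p, q, dt):
    """One classical Runge-Kutta step of size dt."""
    k1p, k1q = vector_field(p, q)
    k2p, k2q = vector_field(p + 0.5 * dt * k1p, q + 0.5 * dt * k1q)
    k3p, k3q = vector_field(p + 0.5 * dt * k2p, q + 0.5 * dt * k2q)
    k4p, k4q = vector_field(p + dt * k3p, q + dt * k3q)
    return (p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0,
            q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0)

def max_energy_drift(p0, q0, dt=1e-3, steps=5000):
    """Largest |h(p(t), q(t)) - h(p0, q0)| along the computed motion."""
    e0 = hamiltonian(p0, q0)
    p, q = p0, q0
    drift = 0.0
    for _ in range(steps):
        p, q = rk4_step(p, q, dt)
        drift = max(drift, abs(hamiltonian(p, q) - e0))
    return drift

print(max_energy_drift(0.0, 1.0))
```

The printed drift should be many orders of magnitude smaller than the energy itself and should decrease further when `dt` is reduced.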


Another consequence, already mentioned in Problem 10, §2.24, p.137, of 
the symmetry of Hamiltonian equations is the following: 


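19 Proposition. Let $V = \mathbb{R}^\ell \times U_0$ with $U_0$ open subset of $\mathbb{R}^\ell$ (or $\mathbb{R}^{\ell_1} \times
\mathbb{T}^{\ell_2}$, $\ell_1 + \ell_2 = \ell$). Let $h \in C^\infty(\mathbb{R}^\ell \times U_0)$ be a time-independent regular Hamiltonian
function.¹⁶ Call $S_t(\pi, \beta)$ the point into which the initial datum $(\pi, \beta)$
evolves in time $t$ through the equations:

$$
\dot p = -\frac{\partial h}{\partial q}(p, q), \qquad \dot q = \frac{\partial h}{\partial p}(p, q).
\tag{3.11.37}
$$

Suppose that for $\tau \in [0, t]$, the data $(\pi, \beta) \in A \subset V$ are such that $S_\tau(\pi, \beta) \in
V$, i.e., $S_\tau A \subset V$ if $\tau \in [0, t]$, i.e. the evolution of the points in $A$ takes place
inside $V$ for all $\tau \in [0, t]$, and suppose that $A$ is measurable; then

$$
\text{volume } S_t A = \int_{S_t A} dp\, dq = \text{volume } A.
\tag{3.11.38}
$$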
S+A 


Observation. This is read by saying “the Hamiltonian flow preserves the phase 
space volume” and it is called the “Liouville theorem”. 


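PROOF. This is a consequence of the fact that the Hamiltonian equations have
zero divergence: $-\sum_{i=1}^{\ell} \frac{\partial^2 h}{\partial p_i \partial q_i} + \sum_{i=1}^{\ell} \frac{\partial^2 h}{\partial q_i \partial p_i} = 0$ (see the hint to Problem
10, §2.24, where the argument is given in detail). mbe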
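
A numerical illustration of the Liouville theorem: for a flow in the plane, volume preservation is equivalent to the Jacobian determinant of the map $(p, q) \to S_t(p, q)$ being equal to 1. The sketch below estimates this determinant by central finite differences. The Hamiltonian $h = p^2/2 - \cos q$, the Runge-Kutta integrator, the step sizes and the increment `eps` are illustrative assumptions, so the result equals 1 only up to discretization errors.

```python
import math

def vector_field(p, q):
    """Hamilton's equations for the assumed h(p, q) = p^2/2 - cos q."""
    return (-math.sin(q), p)

def flow(p, q, t, steps=2000):
    """Approximate S_t(p, q) with classical Runge-Kutta steps."""
    dt = t / steps
    for _ in range(steps):
        k1p, k1q = vector_field(p, q)
        k2p, k2q = vector_field(p + 0.5 * dt * k1p, q + 0.5 * dt * k1q)
        k3p, k3q = vector_field(p + 0.5 * dt * k2p, q + 0.5 * dt * k2q)
        k4p, k4q = vector_field(p + dt * k3p, q + dt * k3q)
        p, q = (p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0,
                q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0)
    return p, q

def jacobian_determinant(p, q, t, eps=1e-5):
    """Determinant of the derivative of (p, q) -> S_t(p, q), by central differences."""
    p_plus, q_plus = flow(p + eps, q, t)
    p_minus, q_minus = flow(p - eps, q, t)
    dp_dp = (p_plus - p_minus) / (2.0 * eps)
    dq_dp = (q_plus - q_minus) / (2.0 * eps)
    p_plus, q_plus = flow(p, q + eps, t)
    p_minus, q_minus = flow(p, q - eps, t)
    dp_dq = (p_plus - p_minus) / (2.0 * eps)
    dq_dq = (q_plus - q_minus) / (2.0 * eps)
    return dp_dp * dq_dq - dp_dq * dq_dp

print(jacobian_determinant(0.5, 1.0, 3.0))
```

A value close to 1 at every point means that small phase space regions are deformed by the flow without changing their area, as stated by Eq. (3.11.38).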


16 See observation (4) to Proposition 17; the function $H(\pi, \beta, t) = h(\pi, \beta)$ is a regular
Hamiltonian on $V \times \mathbb{R}$ in the sense of Definition 15, p.214.
A corollary to the above proposition is the following.


20 Proposition. Given the same assumptions as in Proposition 19, suppose,
also, that the set of the $(\pi, \beta)$ such that $h(\pi, \beta) < E$ is a set $\Omega_E$ whose closure
in $\mathbb{R}^\ell \times \mathbb{R}^\ell$ (or $\mathbb{R}^\ell \times \mathbb{T}^\ell$ or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2})$, $\ell_1 + \ell_2 = \ell$) is contained in
$V$ and is bounded. Then given any $(\pi_0, \beta_0) \in \Omega_E$, $t_0 > 0$ and a neighborhood
$W \subset \Omega_E$ of $(\pi_0, \beta_0)$, there exists $t > t_0$ such that $S_t W \cap W \ne \emptyset$.


Observation. So “if the energy $E$ surface is bounded”, close to every point
“inside” it there is another point coming “as close” after any given time: this
is Poincaré’s recurrence theorem.

If the system contains $N$ (e.g. $\sim 10^{24}$) points enclosed in a box (modeled by
a potential tending very quickly to $+\infty$ outside the box) and if it is initially
in a configuration $\xi$ in which all points are confined to the left half of the
box (say), then as close to it as we wish there is another configuration which
evolves so that, waiting “long enough”, we shall be surprised to see that all
the particles will again occupy the left half of the box. This nice paradox
(“Zermelo’s paradox”) gave some problems to Boltzmann.


PROOF. The proof is a very simple consequence of Proposition 19 and is
described in greater generality (for divergenceless differential equations) in
Problem 11, §2.24, p.138 (see hint). mbe


In connection with the Hamiltonian equations, the notion of “canonical 
transformation” plays an important role. A transformation of coordinates 
is canonical when it leaves the structure of the Hamiltonian equations un- 
changed. Such a notion has remarkable importance in the algorithms used in 
the theory of perturbations, which we shall introduce in Chapter 5. 


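16 Definition. Let $V$ be an open set in $\mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$ (or in $\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$
or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$) and let $H$ be a regular Hamiltonian
function on $V$ (see Definition 15).

Suppose that on $V$ a $C^\infty$ map $\mathcal{C}$ is defined such that:

(i) The image of $(p, q, t) \in V$ has the form $(\pi, \kappa, t) = \mathcal{C}(p, q, t)$, i.e., $\mathcal{C}$ is an
“isochronous map” (since it does not affect $t$).

(ii) The map $\mathcal{C}$ maps $V$ onto $W = \mathcal{C}(V)$, which is an open subset of $\mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$
(or in $\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$) and it is invertible
and nonsingular,¹⁷ i.e., $\mathcal{C}$ is a regular change of coordinates on $V$.

(iii) There is a real-valued function $H' \in C^\infty(W)$ such that if $t \to (p(t), q(t), t)
\in V$, $t \in [t_1, t_2]$, is any motion in $V$ verifying the Hamiltonian equations with
Hamiltonian $H$, then $t \to (\pi(t), \kappa(t), t) = \mathcal{C}(p(t), q(t), t) \in W$, $t \in [t_1, t_2]$,
verifies the Hamiltonian equations relative to $H'$ and vice versa. One says
that $\mathcal{C}$ is a “canonical transformation of $V$ in $W$ with respect to the pair of
conjugate Hamiltonians $H$ and $H'$”.


17 i.e., its Jacobian determinant does not vanish. Hence, $\mathcal{C}^{-1}$ has the same properties by
the implicit function theorem.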
Observation. In general, if a map $\mathcal{C}$ is canonical for the pair $(H, H')$, it will not
be canonical for the pair $(H, H'')$ no matter how $H''$ is chosen, if $H'' \ne H'$
(for an example, see Problem 38, at the end of this section).

It is therefore tempting to call “completely canonical” a map C between 
V and W such that for any choice of a Hamiltonian function H on V, one 
can find a conjugated Hamiltonian function H’ on W in some standard way 
(Levi-Civita). 

We shall make the notion of “complete canonicity” precise only in the 
simple case of “time-independent” canonical transformations. 


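17 Definition. Let $V = \tilde V \times \mathbb{R}$ be an open subset of $\mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$ (or of
$\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or of $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$) and let $\mathcal{C}$ have
the form $\mathcal{C}(p, q, t) = (\tilde{\mathcal{C}}(p, q), t)$ with $\tilde{\mathcal{C}}$ being a regular change of coordinates
between $\tilde V$ and its image $\tilde W$, $\tilde W \subset \mathbb{R}^\ell \times \mathbb{R}^\ell$ (or $\tilde W \subset \mathbb{R}^\ell \times \mathbb{T}^\ell$ or
$\tilde W \subset \mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2})$, $\ell_1 + \ell_2 = \ell$).

We shall say that $\tilde{\mathcal{C}}$ is a “completely canonical time-independent” or, simply,
a “completely canonical” transformation if $\mathcal{C}$ is a transformation which
conjugates canonically every regular Hamiltonian function $H$ on $V$ with

$$
H'(\pi, \kappa, t) = H(\tilde{\mathcal{C}}^{-1}(\pi, \kappa), t), \qquad \forall (\pi, \kappa) \in \tilde W.
\tag{3.11.39}
$$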


Observation. In other words, a time-independent completely canonical trans- 
formation is one with the property that any Hamiltonian function is conju- 
gated to itself computed in the new coordinates. 


The following proposition provides a very general method of construction 
of canonical transformations and of completely canonical transformations. 


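21 Proposition. Let $H$ be a regular Hamiltonian function on the open set
$V \subset \mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$ (or in $\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$). Let
$F \in C^\infty(\mathbb{R}^{2\ell+1})$ be a function denoted

$$
(q, \kappa, t) \to F(q, \kappa, t) \in \mathbb{R}
\tag{3.11.40}
$$

For $i = 1, \dots, \ell$, set $t' = t$ and

$$
p_i = \frac{\partial F}{\partial q_i}(q, \kappa, t), \qquad \pi_i = -\frac{\partial F}{\partial \kappa_i}(q, \kappa, t)
\tag{3.11.41}
$$

and assume that Eq. (3.11.41) establishes a one-to-one map $\mathcal{C}_F$ between
$(p, q, t) \in V$ and $(\pi, \kappa, t) = \mathcal{C}_F(p, q, t) \in W$. Suppose that $\mathcal{C}_F$ is a regular
change of coordinates¹⁸ between $V$ and $W$ ($W \subset \mathbb{R}^\ell \times \mathbb{R}^\ell \times \mathbb{R}$ or
$\mathbb{R}^\ell \times \mathbb{T}^\ell \times \mathbb{R}$ or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$). Then if we define
$(p(\pi, \kappa, t), q(\pi, \kappa, t), t) = \mathcal{C}_F^{-1}(\pi, \kappa, t)$ and

$$
H'(\pi, \kappa, t) = H(p(\pi, \kappa, t), q(\pi, \kappa, t), t) + \frac{\partial F}{\partial t}(q(\pi, \kappa, t), \kappa, t),
\tag{3.11.42}
$$

the map $\mathcal{C}_F$ is a canonical transformation of $V$ onto $W$ with respect to $H$ and
$H'$.

18 i.e., it is one-to-one and with non vanishing Jacobian determinant.

Observations.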

Observations. 

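(1) Note that $F$ is required to be in $C^\infty(\mathbb{R}^{2\ell+1})$ even when $V$ is in $\mathbb{R}^\ell \times
\mathbb{T}^\ell \times \mathbb{R}$ or in $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2}) \times \mathbb{R}$, $\ell_1 + \ell_2 = \ell$. Recall that the points
on a torus are, in the present contexts, always thought of as described in
“flat or angular coordinates”, i.e., by thinking of the torus $\mathbb{T}^\ell$ as obtained by
identifying mod $2\pi$ the points of $\mathbb{R}^\ell$ (see Definition 14, p.211).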

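(2) From the proof of Proposition 21, it will follow that other coordinate
transformations analogous to Eq. (3.11.41) are canonical: for instance, from a
function $\Phi \in C^\infty(\mathbb{R}^{2\ell+1})$,

$$
(q, \pi, t) \to \Phi(q, \pi, t) \in \mathbb{R},
\tag{3.11.43}
$$

one builds a canonical transformation¹⁹ $\mathcal{C}_\Phi$ by setting, $\forall\, i = 1, 2, \dots, \ell$,

$$
p_i = \frac{\partial \Phi}{\partial q_i}(q, \pi, t), \qquad \kappa_i = \frac{\partial \Phi}{\partial \pi_i}(q, \pi, t),
\tag{3.11.44}
$$

$$
H'(\pi, \kappa, t) = H(p(\pi, \kappa, t), q(\pi, \kappa, t), t) + \frac{\partial \Phi}{\partial t}(q(\pi, \kappa, t), \pi, t)
\tag{3.11.45}
$$

where we denote $(p(\pi, \kappa, t), q(\pi, \kappa, t), t) = \mathcal{C}_\Phi^{-1}(\pi, \kappa, t)$.
Similarly, with analogous notations, if $\Psi \in C^\infty(\mathbb{R}^{2\ell+1})$, setting

$$
(p, \kappa, t) \to \Psi(p, \kappa, t) \in \mathbb{R},
\tag{3.11.46}
$$

one defines a canonical transformation¹⁹ $\mathcal{C}_\Psi$ by setting $\forall\, i = 1, \dots, \ell$

$$
q_i = -\frac{\partial \Psi}{\partial p_i}(p, \kappa, t), \qquad \pi_i = -\frac{\partial \Psi}{\partial \kappa_i}(p, \kappa, t), \qquad H' = H + \frac{\partial \Psi}{\partial t}
\tag{3.11.47}
$$

and if $R \in C^\infty(\mathbb{R}^{2\ell+1})$,

$$
(p, \pi, t) \to R(p, \pi, t) \in \mathbb{R},
\tag{3.11.48}
$$

defines a canonical transformation¹⁹ $\mathcal{C}_R$ by setting, $\forall\, i = 1, 2, \dots, \ell$,

$$
q_i = -\frac{\partial R}{\partial p_i}(p, \pi, t), \qquad \kappa_i = \frac{\partial R}{\partial \pi_i}(p, \pi, t), \qquad H' = H + \frac{\partial R}{\partial t}
\tag{3.11.49}
$$


19 between regions where the regularity, invertibility and nonsingularity requirements for
the maps $\mathcal{C}_\Phi$ (or, see below, $\mathcal{C}_\Psi$, $\mathcal{C}_R$) similar to those put on $\mathcal{C}_F$ are verified. Such regions
$V, W$ may be very small or even nonexistent: in the last cases no canonical transformation
is really associated with $F, \Phi, \Psi, R$.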

(3) However, it will appear that the class of canonical transformations built
starting from $F$ as described by Proposition 21 is not essentially less ample
than that obtained by adding to it the canonical transformations associated
with the functions $\Phi$, $\Psi$ and $R$ as described in the preceding observation.
With some natural exceptions, to every $F$ it is possible to associate a $\Phi$, a $\Psi$,
and an $R$ producing the same canonical transformation.

(4) If $F$ is time independent, then $C_F$ defines a completely canonical (time-
independent) map.

(5) $F$ is in general called a “generating function” of $C_F$. The functions
$\Phi$, $\Psi$, $R$ above are also called generating functions.
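
As a quick numerical illustration of Eq. (3.11.41), the following minimal sketch takes $\ell = 1$ and the quadratic generating function $F(q, \kappa) = \frac{1}{2} a q^2 + \frac{1}{2} c \kappa^2 + b \kappa q$. The values of `a`, `b`, `c`, the step `h` and the helper names are illustrative assumptions, not part of the text. It solves Eq. (3.11.41) for $(\pi, \kappa)$ in terms of $(p, q)$ and estimates the Jacobian determinant of the resulting map by central differences; for a canonical map with one degree of freedom this determinant is expected to equal 1.

```python
def c_f(p, q, a=0.7, b=1.3, c=-0.4):
    """Map (p, q) -> (pi, kappa) defined by p = dF/dq, pi = -dF/dkappa
    for F(q, kappa) = a*q**2/2 + c*kappa**2/2 + b*kappa*q (requires b != 0)."""
    kappa = (p - a * q) / b          # invert p = a*q + b*kappa
    pi = -(c * kappa + b * q)        # pi = -dF/dkappa
    return pi, kappa


def jacobian_det(p, q, h=1e-5):
    """Central-difference estimate of det d(pi, kappa)/d(p, q)."""
    pi_p1, ka_p1 = c_f(p + h, q)
    pi_p0, ka_p0 = c_f(p - h, q)
    pi_q1, ka_q1 = c_f(p, q + h)
    pi_q0, ka_q0 = c_f(p, q - h)
    dpi_dp = (pi_p1 - pi_p0) / (2 * h)
    dka_dp = (ka_p1 - ka_p0) / (2 * h)
    dpi_dq = (pi_q1 - pi_q0) / (2 * h)
    dka_dq = (ka_q1 - ka_q0) / (2 * h)
    return dpi_dp * dka_dq - dpi_dq * dka_dp


det = jacobian_det(0.3, -1.1)
print(abs(det - 1.0) < 1e-6)
```

Setting `b` close to zero makes the inversion in `c_f` ill defined, which is the numerical counterpart of the requirement that Eq. (3.11.41) define a one-to-one map.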


PROOF. The proof by direct check is of course possible. However, if performed 
straightforwardly, it quickly becomes quite intricate. It is certainly more con- 
venient to proceed in the following elegant fashion, which also exhibits a new 
form of the least-action principle: the “Hamilton’s principle”. 

Let $\mathcal{M}^V = \mathcal{M}_{t_1,t_2}(p_1, q_1, t_1; p_2, q_2, t_2; V)$ = {set of the motions in $V$ having
the form $t \to m(t) = (p(t), q(t), t) \in V$, $t \in [t_1, t_2]$, and such that $p(t_1) = p_1$,
$q(t_1) = q_1$, $p(t_2) = p_2$, $q(t_2) = q_2$} (“synchronous motions in $V$”).
Consider the function on $\mathcal{M}^V$:

$$S(m) = \int_{t_1}^{t_2} \Big( \sum_{i=1}^{\ell} p_i(t)\, \dot q_i(t) - H(p(t), q(t), t) \Big)\, dt. \tag{3.11.50}$$


With the methods of §2.2.1 and §3.4 by now familiar, one checks that the
stationarity condition for $S$ on $m$ in $\mathcal{M}^V$ is simply that the motion $m$ verifies
the Hamiltonian equations in $V$ with Hamiltonian function $H$ [which are,
essentially, the Euler-Lagrange equations for the action of Eq. (3.11.50)].

Now let $t \to \mu(t) = (\pi(t), \kappa(t), t) = C_F(p(t), q(t), t)$, $t \in [t_1, t_2]$, be the
image motion of a motion $m \in \mathcal{M}^V$: it is a motion in

$$\mathcal{M}^W \stackrel{\rm def}{=} C_F(\mathcal{M}^V) = \mathcal{M}_{t_1,t_2}(C_F(p_1, q_1, t_1), C_F(p_2, q_2, t_2); W).$$

If $\mu$ is to verify the Hamiltonian equations for some Hamiltonian $H'$ on $W$,
it must make the action

$$S'(\mu) = \int_{t_1}^{t_2} \Big\{ \sum_{i=1}^{\ell} \pi_i(t)\, \dot\kappa_i(t) - H'(\pi(t), \kappa(t), t) \Big\}\, dt \tag{3.11.51}$$

stationary in $\mathcal{M}^W$. A sufficient condition for this to occur is, of course, that

$$S(m) = S'(C_F(m)) + \text{constant}, \qquad \forall m \in \mathcal{M}^V. \tag{3.11.52}$$

Equation (3.11.52) is certainly verified if the differential form on $V$:

$$\sum_{i=1}^{\ell} p_i\, dq_i - H(p, q, t)\, dt \tag{3.11.53}$$

and the differential form

$$\sum_{i=1}^{\ell} \pi_i\, d\kappa_i - H'(\pi, \kappa, t)\, dt \tag{3.11.54}$$

are transformed into each other by the transformation $C_F$, up to a total differ-
ential. This condition can be imposed by requiring the existence of a function
$G$ on $W$ such that

$$\sum_{i=1}^{\ell} p_i\, dq_i - H(p, q, t)\, dt = \sum_{i=1}^{\ell} \pi_i\, d\kappa_i - H'(\pi, \kappa, t)\, dt + dG \tag{3.11.55}$$

where $(p, q, t)$ are to be thought of as functions of $(\pi, \kappa, t)$ via the transfor-
mation $C_F$.

To use Eq. (3.11.55), it is more convenient to think of $G$ as a func-
tion of $(q, \kappa, t)$ instead of $(\pi, \kappa, t)$ via Eq. (3.11.41); i.e., set $\tilde G(q, \kappa, t) =
G(\pi(q, \kappa, t), \kappa, t)$. Then it follows from Eq. (3.11.55) that

$$d\tilde G = \sum_{i=1}^{\ell} p_i\, dq_i - \sum_{i=1}^{\ell} \pi_i\, d\kappa_i - (H - H')\, dt, \tag{3.11.56}$$

so we realize that Eq. (3.11.56) holds if and only if there is a function $\tilde G$ which
is such that, $\forall i = 1, \ldots, \ell$,

$$p_i = \frac{\partial \tilde G}{\partial q_i}, \qquad \pi_i = -\frac{\partial \tilde G}{\partial \kappa_i}, \qquad H' - H = \frac{\partial \tilde G}{\partial t}, \tag{3.11.57}$$

thinking of the coefficients of the right-hand side differentials in Eq. (3.11.56)
as functions of $q, \kappa, t$, via Eq. (3.11.41). Such relations are satisfied by the
function $F$, setting $\tilde G = F$. mbe


Observations. Subtracting the differential $d\big(\sum_{i=1}^{\ell} p_i q_i\big)$ from both sides of Eq.
(3.11.56) and thinking of

$$\Psi = F - \sum_{i=1}^{\ell} p_i q_i \tag{3.11.58}$$

as a function of $p, \kappa, t$ via Eq. (3.11.41),²⁰ one finds that the transformation
$C_F$ may also be thought of as $C_\Psi$ described by Eq. (3.11.47). Similarly, setting

$$\Phi = F + \sum_{i=1}^{\ell} \pi_i \kappa_i \tag{3.11.59}$$

and thinking of $\Phi$ as a function of $(q, \pi, t)$ via Eq. (3.11.41),²⁰ one finds that
$C_F$ may also be thought of as $C_\Phi$ described by Eq. (3.11.44). Finally, setting

$$R = F + \sum_{i=1}^{\ell} \pi_i \kappa_i - \sum_{i=1}^{\ell} p_i q_i \tag{3.11.60}$$

and thinking of $R$ as a function of $(p, \pi, t)$ via Eq. (3.11.41),²⁰ one finds that
$C_F$ may also be thought of as $C_R$ described by Eq. (3.11.49).

²⁰ assuming that the necessary inversions can actually be made.

In the problems at the end of this section, it will appear that the inversions
mentioned above (see footnote 20) can be performed at least in small regions
under the respective conditions that the corresponding matrices of second
derivatives of $F$ have nonvanishing determinants.
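
The first of these identifications can be spelled out in one line. The computation below is only a sketch of the missing step, using Eq. (3.11.56) with $\tilde G = F$ and assuming that $q$ can be expressed in terms of $(p, \kappa, t)$:

```latex
d\Psi = dF - d\Big(\sum_{i=1}^{\ell} p_i q_i\Big)
      = \sum_{i=1}^{\ell} p_i\,dq_i - \sum_{i=1}^{\ell} \pi_i\,d\kappa_i + (H'-H)\,dt
        - \sum_{i=1}^{\ell} p_i\,dq_i - \sum_{i=1}^{\ell} q_i\,dp_i
      = -\sum_{i=1}^{\ell} q_i\,dp_i - \sum_{i=1}^{\ell} \pi_i\,d\kappa_i + (H'-H)\,dt ,
```

so that $q_i = -\partial\Psi/\partial p_i$, $\pi_i = -\partial\Psi/\partial\kappa_i$ and $H' = H + \partial\Psi/\partial t$, which is Eq. (3.11.47). The cases of $\Phi$ and $R$ follow in the same way by adding $d\big(\sum_i \pi_i \kappa_i\big)$.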

This somewhat clarifies Observation (3) to Proposition 21. A complete 
clarification arises from the analysis of Problems (6)-(11) at the end of this 
section. The reader should try to think of these observations again after look- 
ing at the problems. 

A simple corollary to the proof of Proposition 21 is the following. 


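22 Proposition. Let $(\pi, \kappa) \to C(\pi, \kappa)$ be a nonsingular invertible $C^\infty$ map
of the open set $V \subset \mathbb{R}^{2\ell}$ or $\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2})$, $\ell_1 + \ell_2 = \ell$, onto $W \subset \mathbb{R}^{2\ell}$ or
$\mathbb{R}^\ell \times (\mathbb{R}^{\ell_1} \times \mathbb{T}^{\ell_2})$, $\ell_1 + \ell_2 = \ell$. Write $C$ explicitly as

$$p = P(\pi, \kappa), \qquad q = Q(\pi, \kappa) \tag{3.11.61}$$

and consider the differential form on $V$,

$$\pi \cdot d\kappa - p \cdot dq = \sum_{i=1}^{\ell} (\pi_i\, d\kappa_i - p_i\, dq_i). \tag{3.11.62}$$

Write it as $-\sum_i (X_i\, d\pi_i + Y_i\, d\kappa_i)$ with

$$X_i(\pi, \kappa) = \sum_{j=1}^{\ell} P_j \frac{\partial Q_j}{\partial \pi_i}(\pi, \kappa), \qquad Y_i(\pi, \kappa) = \sum_{j=1}^{\ell} P_j \frac{\partial Q_j}{\partial \kappa_i}(\pi, \kappa) - \pi_i. \tag{3.11.63}$$

Suppose that the form in Eq. (3.11.62) is exact: i.e. $\forall i, j = 1, \ldots, \ell$

$$\frac{\partial X_i}{\partial \pi_j} = \frac{\partial X_j}{\partial \pi_i}, \qquad \frac{\partial X_i}{\partial \kappa_j} = \frac{\partial Y_j}{\partial \pi_i}, \qquad \frac{\partial Y_i}{\partial \kappa_j} = \frac{\partial Y_j}{\partial \kappa_i}. \tag{3.11.64}$$

Then $C$ is a completely canonical time-independent map.
In particular, if $\pi \cdot d\kappa - p \cdot dq = 0$ the map $C$ is completely canonical: it is
called “homogeneous” in the variables $(\kappa, q)$.
Similar results hold if $p \cdot dq + \kappa \cdot d\pi$, or $-q \cdot dp + \kappa \cdot d\pi$, or $-q \cdot dp - \pi \cdot d\kappa$
are exact differentials: one similarly defines the homogeneous canonical maps
with respect to $(q, \pi)$, or $(p, \pi)$, or $(p, \kappa)$.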


Observations. 

(1) If $C$ is as above and homogeneous in the $(\kappa, q)$ variables, then it cannot be
generated by a generating function $F(\kappa, q)$ as in Eq. (3.11.41). The vanishing of
the differential in Eq. (3.11.62) and the Eqs. (3.11.41), (3.11.42) written as

$$dF = p \cdot dq - \pi \cdot d\kappa + (H' - H)\, dt \tag{3.11.65}$$

imply that $dF = (H' - H)\, dt$, i.e. $H' - H$ is a function of $t$ only and so is $F$ as
well, so that Eq. (3.11.41) gives $\pi = 0$, $p = 0$, which is obviously not usable
to define an invertible map between $(q, p)$ and $(\kappa, \pi)$.

(2) If $C$ is homogeneous as in Observation (1), it might be generated by func-
tions $\Phi(\pi, q)$ or $\Psi(\kappa, p)$ or $R(\pi, p)$: for instance, the map $p = a\pi$, $q = a^{-1}\kappa$
is homogeneous in the $(q, \kappa)$ variables (as $p\, dq = \pi\, d\kappa$) and it is generated by
$\Psi(p, \kappa) = -a^{-1} p \kappa$.

(3) A very interesting homogeneous canonical mapping is met in the theory 
of the motion of a rigid body (see Problems to §4.11). 
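
The example in Observation (2) can be checked directly. The short computation below is only a sketch, using Eq. (3.11.47) with a time-independent $\Psi$ and assuming that $a \neq 0$ is a real constant and $\ell = 1$:

```latex
\Psi(p,\kappa) = -a^{-1} p\,\kappa
\;\Longrightarrow\;
q = -\frac{\partial \Psi}{\partial p} = a^{-1}\kappa , \qquad
\pi = -\frac{\partial \Psi}{\partial \kappa} = a^{-1} p ,
\qquad\text{hence}\qquad
p\,dq = (a\pi)\,(a^{-1}\,d\kappa) = \pi\,d\kappa .
```

So the map is $p = a\pi$, $q = a^{-1}\kappa$, and the form $\pi\, d\kappa - p\, dq$ vanishes identically, as homogeneity in $(\kappa, q)$ requires.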


PROOF. If $\pi \cdot d\kappa - p \cdot dq$ is an exact differential, one sees, by going through
the proof of Proposition 21, that Eq. (3.11.56) can be satisfied by choosing
$H = H'$. mbe


Observations. 

(1) From the proof of Proposition 22 and from Eqs. (3.11.41), (3.11.44),
(3.11.47), and (3.11.49), we see that a sufficient condition in order that any
Hamiltonian $H$ on $V$ is conjugated to a Hamiltonian $H'$ on $W$ given by

$$H'(\pi, \kappa, t) = H(C^{-1}(\pi, \kappa, t)) \tag{3.11.66}$$

is that the transformation $C$ mapping $V$ onto $W$ be generated by a time-
independent function $F$ or $\Phi$ or $\Psi$ or $R$ or be homogeneous in the sense of
Proposition 22.

(2) The interest in canonical transformations consists of the fact that some-
times it is possible to solve the Hamiltonian equations by finding a canonical
transformation transforming the system of Hamiltonian equations into a con-
jugate system with “trivial” Hamiltonian $H'$, i.e., trivially soluble (e.g., $H' = 0$
or $H'(\pi, \kappa) = h(\kappa)$, which indeed yield trivial Hamiltonian equations).

A concrete method to look for such a transformation (“Hamilton-Jacobi
method”) consists in trying to find, using Proposition 21, a function $F$ defined
in a suitable neighborhood $\Omega \subset \mathbb{R}^{2\ell+1}$ such that, $\forall (q, \kappa, t) \in \Omega$,

$$H' = H\Big(\frac{\partial F}{\partial q}(q, \kappa, t), q, t\Big) + \frac{\partial F}{\partial t}(q, \kappa, t) = 0 \tag{3.11.67}$$

or, for some $h$,

$$H\Big(\frac{\partial F}{\partial q}(q, \kappa, t), q, t\Big) + \frac{\partial F}{\partial t}(q, \kappa, t) = h(\kappa). \tag{3.11.68}$$

Equations (3.11.67) and (3.11.68) are to be considered as equations in which
$\kappa$ is a parameter and, therefore, as partial differential equations for a function
$(q, t) \to f(q, t)$:

$$H\Big(\frac{\partial f}{\partial q}(q, t), q, t\Big) + \frac{\partial f}{\partial t}(q, t) = 0 \tag{3.11.69}$$

or

$$H\Big(\frac{\partial f}{\partial q}(q, t), q, t\Big) + \frac{\partial f}{\partial t}(q, t) = \text{constant} \tag{3.11.70}$$

(“Hamilton-Jacobi” equations). We wish to find solutions to Eq. (3.11.69) or
Eq. (3.11.70) which depend on $\ell$ parameters $\kappa = (\kappa_1, \ldots, \kappa_\ell)$.

If we were able to find such a family, i.e., if we were able to find a $C^\infty$
solution $F$ of Eq. (3.11.69) or Eq. (3.11.70) depending on $(q, \kappa, t) \in \Omega$ =
{some open set in $\mathbb{R}^{2\ell+1}$}, we could consider the transformation (3.11.41) and
hope that it defines a canonical map $C_F$ of some open set $\tilde V \subset V$ into a set
$W$: the transformation $C_F$ would then transform the Hamiltonian equations
associated with $H$ into trivial Hamiltonian equations in $W$, with Hamiltonian
function 0 or $h(\kappa)$.
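
As a minimal illustration (not taken from the text), consider a free particle with $\ell = 1$ and $H = p^2/2m$. The function $F(q, \kappa, t) = \kappa q - \kappa^2 t/(2m)$ depends on one parameter $\kappa$ and is a candidate solution of Eq. (3.11.67). The sketch below checks the equation by finite differences and then uses Eq. (3.11.41) to recover the motion; the mass value, the step size and all function names are assumptions chosen for the example.

```python
M = 2.0  # mass of the particle (illustrative value)


def hamiltonian(p, q, t):
    """Free-particle Hamiltonian H = p^2 / (2 m)."""
    return p * p / (2.0 * M)


def gen_f(q, kappa, t):
    """Trial one-parameter solution F = kappa*q - kappa^2 t / (2 m)."""
    return kappa * q - kappa * kappa * t / (2.0 * M)


def hj_residual(q, kappa, t, h=1e-5):
    """H(dF/dq, q, t) + dF/dt, estimated by central differences."""
    dF_dq = (gen_f(q + h, kappa, t) - gen_f(q - h, kappa, t)) / (2 * h)
    dF_dt = (gen_f(q, kappa, t + h) - gen_f(q, kappa, t - h)) / (2 * h)
    return hamiltonian(dF_dq, q, t) + dF_dt


def motion_from_constants(pi, kappa, t):
    """Invert Eq. (3.11.41): p = dF/dq = kappa, pi = -dF/dkappa = -q + kappa t / m.
    Since H' = 0, (pi, kappa) are constants and q(t) = -pi + kappa t / m."""
    return kappa, -pi + kappa * t / M


residual = hj_residual(q=0.4, kappa=1.5, t=0.8)
p0, q0 = motion_from_constants(pi=-0.4, kappa=1.5, t=0.0)
p1, q1 = motion_from_constants(pi=-0.4, kappa=1.5, t=2.0)
print(abs(residual) < 1e-8, abs(q1 - (q0 + p0 * 2.0 / M)) < 1e-12)
```

In this example the new coordinates are simply the constants of the motion: $\kappa$ is the momentum and $-\pi$ is the initial position, so the "trivial" evolution in $(\pi, \kappa)$ reproduces uniform rectilinear motion.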

However, it is obvious that the difficulty of solving Eqs. (3.11.67) and 
(3.11.68) in the above sense is equivalent to or harder than solving the origi- 
nal Hamiltonian equations, and one should not think of Eq. (3.11.67) or Eq. 
(3.11.68) as a miraculous equation. 

The usefulness of the above discussion on Hamilton-Jacobi equations con- 
sists of the possibility of finding approximation algorithms to the solutions to 
Eq. (3.11.67) or Eq. (3.11.68) and, therefore, to the original Hamiltonian equa- 
tions, which are essentially different from the general recursive method seen 
in §2.3, valid for solving the most general first-order differential equations. 

The methods devised to construct recursively successive approximations to 
Eq. (3.11.67) or Eq. (3.11.68) are methods in which the particular structure of 
the Hamiltonian equations is explicitly used. It is therefore not too surprising 
that they reveal themselves to be quite appropriate to the analysis of such 
equations and provide better approximations for a given amount of formal 
work done. 

The reader can convince himself of the truth of the above statement only 
by seeing some concrete problems studied on the basis of approximation al- 
gorithms to the solutions of the Hamilton-Jacobi equations. The best known 
and most celebrated of these methods or some of its variants can be found 
in the theory of the motion of heavenly bodies and, more generally, in the 
stability theory of the motion of conservative systems. An important example 
will be illustrated in §5.9-§5.12. Some “trivial” examples can be found in the
upcoming problems.


3.11.1 Exercises, Problems and Complements


1. Construct the canonical transformation with generating function $f(q, \kappa) = \frac{1}{2} \omega q^2 \tan\kappa$,
$q \in \mathbb{R}$, $\kappa \in \mathbb{T}^1$, and note that the above transformation simplifies the Hamiltonian
$H(p, q) = \frac{p^2}{2} + \frac{1}{2} \omega^2 q^2$. Find the harmonic oscillator motion with the help of this transformation.


2. Consider a one-dimensional mechanical system consisting of a point with mass $m$ subject
to a force with potential energy $V \in C^\infty(\mathbb{R})$. Assume that $V(0) = 0$, $V'(q) \neq 0$ if $q \neq 0$,
$V(q) \to +\infty$ as $q \to \pm\infty$. Consider the canonical map $(p, q) \to (E, \tau)$ with generating function,
see p. 222,

$$f(E, q) = \int_0^q \sqrt{2m(E - V(q'))}\, dq'$$

near a point $(p, q)$ where $\frac{p^2}{2m} + V(q) > 0$. Write it explicitly, finding the Hamiltonian in the
new coordinates and the physical interpretation of the $\tau$ and $E$ coordinates. (Hint: Do not try
to “compute” the integral, but rather perform the necessary differentiations on the integral and
then use the formulae for the one-dimensional motions found in §2.7).


3. Interpret $f$ defined in Problem 2 as a solution of the equation
$\frac{1}{2m} \big(\frac{\partial f}{\partial q}\big)^2 + V(q) = E$
and interpret this as a one-parameter solution of the Hamilton-Jacobi equation for the
mechanical system in Problem 2, in the sense of Eq. (3.11.67), of the form $f(E, q) - Et$ (or
in the sense of Eq. (3.11.68), of the form $f(E, q)$ with $h(\kappa) = E$).


4. In the context of Problem 2, define, for $E > 0$,

$$T(E) = 2 \int_{q_-(E)}^{q_+(E)} \frac{dq'}{\sqrt{\frac{2}{m}(E - V(q'))}}$$

where $q_\pm(E)$ are the roots of $E - V(q) = 0$. For $E > 0$, let $a(E) = \int_0^E \frac{T(E')}{2\pi}\, dE'$ and let
$A \to e(A)$ be its inverse function (such that $e(a(E)) = E$). Consider the canonical transformation
$(p, q) \to (A, \varphi)$ with generating function $S(A, q) = \int_0^q \sqrt{2m(e(A) - V(q'))}\, dq'$ near some
$(p, q) \neq (0, 0)$.

Compute the new Hamiltonian and show that the canonical map may be extended to a
canonical map of $\mathbb{R}^2/(0, 0)$ into $(0, \lim_{E \to \infty} a(E)) \times \mathbb{T}^1$. (Hint: Let $\varphi = \frac{\partial S}{\partial A}(A, q)$ mod $2\pi$,
$p = \frac{\partial S}{\partial q}(A, q)$ and show that this is a $C^\infty$ map between the indicated sets.)


5. Show that the transformation in Problem 4 is a natural generalization of the Cartesian-
polar coordinates in the plane (Hint: Consider the special case $H = (p^2 + q^2)/2$, where it gives
exactly the Cartesian-polar coordinates. Draw the curves $A = \text{const}$ and compare them
with the circles.)

The angle defined in Problem 4 is called the “average anomaly” and, therefore, the time
evolution of the average anomaly is always a uniform rotation.


6. Let $A, B, C$ be $\ell \times \ell$ matrices and $A, C$ be symmetric. Define on $\mathbb{R}^{2\ell+1}$

$$F(q, \kappa, t) \stackrel{\rm def}{=} \frac{1}{2} A q \cdot q + \frac{1}{2} C \kappa \cdot \kappa + B \kappa \cdot q.$$

Show that if $\det B \neq 0$ the map $C_F$ is well defined and completely canonical between $\mathbb{R}^{2\ell+1}$
and itself. Show that its Jacobian determinant is 1, at least in the case $\ell = 1$ (the case
$\ell > 1$ is discussed in §3.12). (Hint: First deal in detail with the case $\ell = 1$ when $A, B, C$ are
simply numbers.)


7. In the context of Problem 6, show that $\det B \neq 0$ is a necessary and sufficient condition for
$C_F$ to be defined. Hence, $F(q, \kappa, t) = (\kappa^2 + q^2)/2$ does not define a canonical transformation.


8. Let $F$ be as in Problem 6. Construct explicitly the other generating functions for the
canonical map [Eqs. (3.11.58), (3.11.59), and (3.11.60)] and check that, via Eqs. (3.11.44),
(3.11.46), and (3.11.49), they all generate the same completely canonical transformation
if $\det A, \det B, \det C \neq 0$. Check that all the inversions mentioned in connection with the
quoted formulae can actually be performed, in the present situation.


9.* Let $F$ be as in Proposition 21. Let $(p_0, q_0, t_0)$, $(\pi_0, \kappa_0, t_0)$ be two points related by Eq.
(3.11.41). Define the $\ell \times \ell$ matrices

$$A_{ij} = \frac{\partial^2 F}{\partial q_i \partial q_j}(q_0, \kappa_0, t_0), \qquad B_{ij} = \frac{\partial^2 F}{\partial q_i \partial \kappa_j}(q_0, \kappa_0, t_0), \qquad C_{ij} = \frac{\partial^2 F}{\partial \kappa_i \partial \kappa_j}(q_0, \kappa_0, t_0).$$

Show that if $\det B \neq 0$ then the map $C_F$ is defined in a neighborhood of $(p_0, q_0, t_0)$ (Hint:
Use Problem 6 and apply the implicit function theorem to take into account that $F$ no
longer has constant second derivatives as in the cases of Problems 6 and 8.)


10.* Is it possible that $C_F$ exists near $(p_0, q_0, t_0)$ in the context of Problem 9 when $\det B = 0$?
(Answer: No; hence, $F(q, \kappa, t) = f(q) + g(\kappa)$ does not define a canonical transformation.
Check this directly.)


11. Show that the invertibility properties of the matrices $A, B, C$ mentioned in connection
with the quotation of Eqs. (3.11.58), (3.11.59), and (3.11.60) in Problem 8 are necessary,
in general, in order to be able to express $C_F$ as $C_\Phi$, $C_\Psi$ or $C_R$. (Hint: Consider, for $\ell = 1$,
$F(q, \kappa) = q\kappa$: this is a case where $A = C = 0$ and the inversion cannot be realized. In this
case, it is impossible to generate the corresponding canonical transformation with a function
$\Psi(p, \kappa, t)$, since the transformation is easily checked to be homogeneous with respect to $(\kappa, p)$
as $q\, dp + \pi\, d\kappa = 0$. See Proposition 22, p.224, and the subsequent Observation (1). Similar
considerations hold for $\kappa^2 + \kappa q$, as $q\, dp + \pi\, d\kappa = -2\kappa\, d\kappa$, which is equally impossible for
reasons similar to those used in Observation (1) to Proposition 22.)


12. Consider $x \in \mathcal{M}_{t_1,t_2}(\xi_1, \xi_2)$ and $y \in \mathcal{V}_x$. Call the variation $y$ “nontrivial” if
$t \to z(t) = \frac{\partial y}{\partial \varepsilon}(t, 0)$, $t \in [t_1, t_2]$, is such that $z \neq 0$. Define $x$ to be a “strict local minimum” for the
action $A$ relative to $\mathcal{M}$ if for every variation $y \in \mathcal{V}_x(\mathcal{M})$ which is nontrivial, there exists $\eta_y > 0$
such that $A(y_\varepsilon) > A(x)$, $\forall\, 0 < |\varepsilon| < \eta_y$, or if $\mathcal{V}_x(\mathcal{M})$ only contains trivial variations. Examine
the proof of Proposition 38, §2.24.1, p.132, to show that in the statements of Proposition 38,
§2.24.1, Proposition 6, §3.3, p.152, Proposition 8, §3.5, p.163, one can replace the words
“local minimum” by “strict local minimum” (Hint: Just look at the proof of Proposition
38, p. 132, and Eq. (2.24.33).)


13. Let $t \to x(t)$, $t \in [0, T]$, be a motion verifying the equations associated with the
Lagrangian (3.11.23) and taking place in the open set $U_0 \subset \mathbb{R}^\ell$. Let $E$ be the energy of $x$ [see
Eq. (3.11.33)]. Consider $x$ for $t \in [t_1, t_2] \subset [0, T]$ and fix $t_1$: so $x \in \mathcal{M}_{t_1,t_2}(x(t_1), x(t_2); E)$ =
{space of the motions in $\mathcal{M}_{t_1,t_2}(x(t_1), x(t_2))$ taking place in $U_0$ and with energy $E$}.
Show that $x$ makes stationary and (if $t_2$ is close enough to $t_1$) strictly locally minimal (see
Problem 12) the action

$$\tilde A = \int_{t_1}^{t_2} T(t)\, dt$$

in $\mathcal{M}_{t_1,t_2}(x(t_1), x(t_2); E)$. (Hint: Simply note that if $A$ is stationary or strictly locally
minimal on $x$ in $\mathcal{M}_{t_1,t_2}(x(t_1), x(t_2))$ it is such in any $\mathcal{M} \subset \mathcal{M}_{t_1,t_2}(x(t_1), x(t_2))$. Then
remark that $A(x') = \int_{t_1}^{t_2} (T(t) - V(t))\, dt = 2\tilde A(x') - E(t_2 - t_1)$ if $x' \in \mathcal{M}_{t_1,t_2}(x(t_1), x(t_2); E)$,
as $T(t) + V(t) = E$.)


14. Show through examples that it is possible that the set $\mathcal{M}_{t_1,t_2}(x(t_1), x(t_2); E)$ consid-
ered in Problem 13 contains finitely many points (hence $\mathcal{V}_x(\mathcal{M}_{t_1,t_2}(x(t_1), x(t_2); E))$ only
contains trivial variations). Nevertheless, even in such cases the statement of Problem 12 is
not an empty one: for instance, deduce from Problem 12 that the free motion in $\mathbb{R}^d$ takes
place along straight lines. (Hint: Let $t \to x(t)$ be a free motion in $\mathbb{R}^d$, then $T(t) = \frac{1}{2} \dot x(t)^2$.
If $t \to x(t)$ were not a straight line, then $\mathcal{V}_x(\mathcal{M}_{t_1,t_2}(x(t_1), x(t_2); E))$ would not consist only
of trivial variations: however, $\tilde A(x') = E(t_2 - t_1)$ for all $x' \in \mathcal{M}_{t_1,t_2}(x(t_1), x(t_2); E)$ and,
therefore, $x$ could not be a strict local minimum!)


15. On $\mathbb{R}^{Nd}$, consider the metric associated with the scalar product of Eq. (3.7.1) and let
$d\ell$ be the line element of the curve in $\mathbb{R}^{Nd}$ with equations $t \to x(t)$, $t \in [t_1, t_2]$, which is
a motion of energy $E$ of $N$ points, with masses $m_1, \ldots, m_N > 0$, under the influence of a
force with potential energy $V \in C^\infty(\mathbb{R}^{Nd})$. Show that

$$A(x) = \int_{\xi_1}^{\xi_2} \sqrt{2}\, \sqrt{E - V(\xi(\ell))}\, d\ell - E(t_2 - t_1) \qquad \text{if } d\ell = \sqrt{2T(t)}\, dt,$$

where $A(x) = \int_{t_1}^{t_2} (T(t) - V(t))\, dt$ and $\ell \to \xi(\ell)$ is the description of the trajectory of $x$ in
terms of the curvilinear abscissa $\ell$ on it and the integral $\int_{\xi_1}^{\xi_2}$ is the curvilinear integral on
the trajectory.


16. Consider $N$ points in $\mathbb{R}^d$, with masses $m_1, \ldots, m_N$. Assume that such a system is
subject to an active force with potential energy $V$ and to an ideal holonomous constraint
to a regular $\ell$-dimensional surface $\Sigma \subset \mathbb{R}^{Nd}$. On $\Sigma$ consider two points $\xi_1, \xi_2$ and the set,
$\mathcal{M}_{0,1}(\xi_1, \xi_2|\Sigma)$, of the $C^\infty$ curves joining $\xi_1$ to $\xi_2$ and parameterized by some parameter
$\tau$ varying between 0 and 1. Given $E \in \mathbb{R}$, define on $\mathcal{M}_{0,1}(\xi_1, \xi_2|\Sigma)$ the curvilinear integral
on the curve $X \in \mathcal{M}_{0,1}(\xi_1, \xi_2|\Sigma)$ as

$$S(X) = \int_{\xi_1}^{\xi_2} \sqrt{E - V(\xi)}\, ds,$$

where $ds$ is the line element on $\Sigma$, measured with the kinetic energy metric $ds^2 =
\sum_{i=1}^{N} m_i (dx_i)^2$, Eq. (3.7.1).

Show that the least-action principle implies that $S$ is stationary on the curve $X$ if and
only if $X$ is a trajectory of a motion with energy $E$ leaving $\xi_1$ and reaching $\xi_2$ (“Mau-
pertuis’ principle”). (Hint: Consider a local system of local coordinates near $X$ permit-
ting representation of the points of $\Sigma \cap U$ through some parametric equations $\xi = x(\alpha)$,
$\alpha = (\alpha_1, \ldots, \alpha_\ell) \in \Omega \subset \mathbb{R}^\ell$. Suppose, for simplicity, that $X(t) \subset U \cap \Sigma$, $\forall t \in [t_1, t_2]$. Assume
$\mathcal{L}$ to be a Lagrangian of the form of Eq. (3.11.23) describing the system in these coordinates.
Write the stationarity conditions of $S$ in $\mathcal{M}_{0,1}(\xi_1, \xi_2|\Sigma)$ for the curve $X$ with parametric
equations $\tau \to \alpha(\tau)$, $\tau \in [0, 1]$, in the chosen coordinates. Then, in the resulting Euler-
Lagrange equations, perform the change of coordinates $\tau \leftrightarrow t$:

$$t = \int_0^\tau \sqrt{\frac{\sum_{i,j=1}^{\ell} g_{ij}(\alpha(\theta))\, \alpha_i'(\theta)\, \alpha_j'(\theta)}{2\,(E - V(x(\alpha(\theta))))}}\; d\theta$$

where the prime denotes differentiation with respect to $\tau$ or $\theta$. One finds that the motion
$t \to x(\alpha(\tau(t)))$ has energy $E$ and verifies the Lagrangian equations for $\mathcal{L}$.

A more interesting alternative proof: consider a variation $\tau \to X_\varepsilon(\tau)$ of the path $X$ in
$\mathcal{M}_{0,1}(\xi_1, \xi_2|\Sigma)$ and imagine it run at constant energy $E_\varepsilon$ chosen so that starting at time
$t_1$ in $\xi_1$ the point $\xi_2$ is reached at time $t_2$ ($E_\varepsilon$ will differ by some $\delta E$ from $E$): call $x_\varepsilon(t)$
this motion which is a variation of $X$ in $\mathcal{M}_{t_1,t_2}(\xi_1, \xi_2|\Sigma)$ and remark that its time law is
determined by $t - t_1 = \int_{\xi_1}^{\xi(t)} \frac{ds}{\sqrt{2(E_\varepsilon - V(X_\varepsilon(s)))}}$. Then by Problem (13) the action of $x_\varepsilon$ is

$$A(x_\varepsilon) = 2 \int_{t_1}^{t_2} T(t)\, dt - E_\varepsilon (t_2 - t_1) = \int_{t_1}^{t_2} \sqrt{2T(t)}\, \sqrt{2T(t)}\, dt - E_\varepsilon (t_2 - t_1)$$

and the latter expression can be written $\int_{\xi_1}^{\xi_2} \sqrt{2}\, \sqrt{E_\varepsilon - V(X_\varepsilon(s))}\, ds - E_\varepsilon (t_2 - t_1)$. Thus

$$\delta A = \sqrt{2}\, \delta S + \Big[ \int_{\xi_1}^{\xi_2} \frac{ds}{\sqrt{2(E - V(X(s)))}} - (t_2 - t_1) \Big]\, \delta E = \sqrt{2}\, \delta S,$$

hence stationarity of $S$ on a curve in $\mathcal{M}_{0,1}(\xi_1, \xi_2|\Sigma)$ implies stationarity of $A$ on the
corresponding motion running on the curve with energy $E$.)

17. In the context of Problem 16, show that Maupertuis’ principle can be interpreted
as saying that the motions developing on $\Sigma$ with energy $E$ take place along the geodesics
of $\Sigma$ with respect to the metric on $\Sigma$:

$$dh = \sqrt{2(E - V(\xi))}\, ds,$$

where $ds$ is the kinetic energy line element on $\Sigma$. (We recall that by definition, a curve on $\Sigma$ is
called a geodesic for a given line element on $\Sigma$ if it makes stationary the distance between
any two of its points measured along the curve itself using the given line element.)

In other words, if we call the distance between two points $\xi_1, \xi_2 \in \Sigma$ measured with the line
element $dh$ along a given curve on $\Sigma$, with the name “mechanical path” with energy $E$ on
$\Sigma$, we can say that the “motions with energy $E$ on $\Sigma$ take place along trajectories making
stationary the mechanical path with energy $E$”. As usual, it is possible to show that the
mechanical systems with Lagrangian Eq. (3.11.23) have the property that a trajectory of
any of their motions with energy $E$, taking place on $\Sigma$, not only makes the mechanical path
stationary but actually strictly minimizes it on short enough segments.


18.* Under the assumptions of Problem 17, let $s \to X(s)$, $s \in [s_1, s_2]$, be a geodesic segment
on $\Sigma$ for the line element $dh$. Suppose that $E - V(X(s)) > 0$, $\forall s \in [s_1, s_2]$. Show that there
is $\bar s > s_1$, such that if $s_2 \in [s_1, \bar s]$, the curve $s \to X(s)$, $s \in [s_1, s_2]$, makes strictly locally
minimal the mechanical path with energy $E$ between $X(s_1)$ and $X(s_2)$.


19. A point with mass $m > 0$ is bound to a surface $\Sigma \subset \mathbb{R}^3$ by an ideal constraint and
it is subject to no other forces. Show that as a consequence of the Maupertuis principle,
Problems 16-18, the point runs on $\Sigma$ in such a way that if two points on its trajectory are
close enough, then the trajectory itself is the one minimizing the distance on $\Sigma$ between the
two points, i.e., the trajectory is the shortest path on $\Sigma$ joining the two points, the distance
being measured in the ordinary $\mathbb{R}^3$ sense (“geodesics” or “Fermat’s principle”). (Hint: Note
that $dh$ and $ds$ are now proportional, and use Problem 18.)


20. Consider the line element $ds^2 = (dx^2 + dy^2)/y^2$ defined on the half-plane $y > 0$. Determine
its geodesics by thinking of them via the mechanical interpretation, permitted by Problem
16, which allows us to regard them as the zero energy motions of the mechanical system
with Lagrangian $\mathcal{L} = \frac{1}{2}(\dot x^2 + \dot y^2) + \frac{1}{2y^2}$.


21. Calling the geodesics of Problem 20 “straight lines” for the geometry (“Lobachevski
geometry” or “non-Euclidean geometry”) defined by the line element $ds$, check the truth or
the falsity of the following statements:

(i) Given two points in the half-plane $y > 0$, there is one and only one straight line through
them.

(ii) Two points in the $y > 0$ region are joined by just one straight line segment (if a straight
line segment is defined as a connected closed subset of a straight line).

(iii) Given a point, and a straight line not containing it, there exists just one straight line
containing the point and “parallel” to the first straight line (i.e., without common points
with it).


22. Same as Problems 20 and 21 for the geometries associated with the following line
elements:
(i) $ds^2 = (x^2 + y^2)(dx^2 + dy^2)$, $(x, y) \in \mathbb{R}^2 \backslash 0$;
(ii) $ds^2 = (1 - x^2 - y^2)^\alpha (dx^2 + dy^2)$, $(x, y) \in \mathbb{R}^2$, $x^2 + y^2 < 1$, $\alpha \in \mathbb{R}$;
(iii) $ds^2 = \frac{dx^2 + dy^2}{x^2 + y^2}$, $(x, y) \in \mathbb{R}^2 \backslash 0$.


23. Same as Problems 20 and 21 for the geometry defined on a sphere by the line element
induced by the Euclidean distance of $\mathbb{R}^3$; i.e., $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ in polar coordinates.


24. Consider the geometry defined in the half-plane $y > 0$ by the line element of Problem 20.
Define a “triangle” as a figure formed by three points pairwise connected by geodesic
segments. Given a triangle, denote $\alpha, \beta, \gamma$ the three angles relative to its three vertices
(defined as the angles between the tangents to the two geodesic segments meeting at the
various vertices). The quantity $\alpha + \beta + \gamma - \pi$ is called the “geodesic defect”: show that it
is $< 0$. Show that the same quantity computed in the analogous situation for the sphere’s
geometry of Problem 23 is $> 0$.


25. A light ray moves in a plane strip $x \in \mathbb{R}$, $|y| < 1$ with refraction index

$$n(x, y) = \sqrt{1 - \varepsilon y^2}, \qquad \varepsilon < 1.$$

Using Fermat’s principle, show that the ray proceeds along a sinusoidal path, if it is assumed
that the ray starts at the origin with an initial direction close to the horizontal. Recall, for
this purpose, that Fermat’s principle says that the rays follow a path that makes stationary
the “optical path” between any two of its points, within the set of the paths joining them.
The optical path, in a medium with index of refraction $n(x, y)$, associated with the curve
$X \in \mathcal{M}_{0,1}(\xi_1, \xi_2)$, is

$$\int_{\xi_1}^{\xi_2} n(x, y)\, ds, \qquad ds = \sqrt{dx^2 + dy^2}.$$

(Hint: Interpret the above problem as a mechanical problem via Problems 16 and 17.)

Hence, via Maupertuis’ principle, the problem of the determination of a light path can
be interpreted as a purely mechanical problem.


26. Solve the problems at the end of §18, §20, §21, §24, §32, §39, §44 in [28].


27. Perform the Legendre transformation on the Lagrangian L = yt? + y? and explain
why one gets strange results.


28. Consider the function on $\mathbb{R}^{2\ell}$: $\mathcal{L}(\dot q, q) = \frac{1}{2} A \dot q \cdot \dot q + \frac{1}{2} C q \cdot q + B \dot q \cdot q$ where $A, C$ are
$\ell \times \ell$ symmetric matrices and $B$ is an $\ell \times \ell$ matrix. Under which conditions on $A, B, C$ is $\mathcal{L}$ a
regular Lagrangian on $\mathbb{R}^{2\ell}$? In these cases, write the corresponding Hamiltonian function.
Similarly, consider the function on $\mathbb{R}^{2\ell}$: $H(p, q) = \frac{1}{2} A p \cdot p + \frac{1}{2} C q \cdot q + B p \cdot q$ and find the
conditions for $H$ to be a regular Hamiltonian and write the corresponding Lagrangian.


29. In the cases when the Lagrangian in Problem 28 is regular, write the energy conservation
theorem, Proposition 18, §3.11, in terms of $\dot q$ and $q$. (Hint: $H = p \cdot \dot q - \mathcal{L}(\dot q, q)$, and then
express $p$ in terms of $\dot q, q$ and use Proposition 18.)


30. Show that the time-independent completely canonical linear transformations of $\mathbb{R}^{2\ell}$
onto $\mathbb{R}^{2\ell}$ form a group $S_\ell$, under the natural composition law.


31. The set $G$ of the linear completely canonical transformations on $\mathbb{R}^2$ with generating
functions $\Phi(\pi, q) = \frac{1}{2} a \pi^2 + \frac{1}{2} c q^2 + b \pi q$, $b \neq 0$, which we denote $(a, c, b)$, does not form
a subgroup of $S_1$. Prove this by finding the composition law of $(a, c, b)$ and $(a', c', b')$.
Show that $(a, c, b) \cdot (a', c', b') \in G$ if and only if $a'c \neq 1$. (Hint: The composition law, if
$\delta = ac - b^2$, $\delta' = a'c' - b'^2$, is

$$(a, c, b) \cdot (a', c', b') = \Big( \frac{a - a'\delta}{1 - a'c},\; \frac{c' - c\,\delta'}{1 - a'c},\; \frac{b b'}{1 - a'c} \Big).)$$


32. Same as Problem 31 for the class $G'$ of the canonical transformations generated by
functions $F(\kappa, q) = \frac{1}{2} a q^2 + \frac{1}{2} c \kappa^2 + b q \kappa$, $b \neq 0$. (Hint: The composition law is now, for
suitable $\delta, \delta'$,

$$(a, c, b) \cdot (a', c', b') = \Big( \frac{a a' + \delta'}{a + c'},\; \frac{c c' + \delta}{a + c'},\; \frac{-b b'}{a + c'} \Big).)$$

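33. Find a generating function for the transformation $(\pi, \kappa) \to (p, q)$ defined by $p =
R\pi$, $q = (R^T)^{-1}\kappa$, where $R$ is a nonsingular $\ell \times \ell$ matrix: this transformation is completely
canonical. (Hint: Look for a generating function like $\Phi(\pi, q) = B\pi \cdot q$ with $B$ being an $\ell \times \ell$
matrix.)


34. Let $\kappa \to q = f(\kappa)$ be an invertible nonsingular transformation of $\mathbb{R}^\ell$ onto itself. Find
out how to define $p = F(\pi, \kappa)$ so that the map $(\pi, \kappa) \to (p, q) = (F(\pi, \kappa), f(\kappa))$ will be
completely canonical. (Answer: If $R_{ij}(\kappa) = \frac{\partial f_i}{\partial \kappa_j}(\kappa)$, then $p = (R(\kappa)^T)^{-1}\pi$.)


35. Let $f \in C^\infty(\mathbb{R}^\ell)$ be multi periodic with periods $2\pi$. Is the function $A' \cdot \varphi + f(\varphi)$ a
generating function of a canonical map of $\mathbb{R}^\ell \times \mathbb{T}^\ell$ onto $\mathbb{R}^\ell \times \mathbb{T}^\ell$? Find a sufficient condition.


36. Let $(A', \varphi) \to f(A', \varphi)$ be a $C^\infty(\mathbb{R}^{2\ell})$ function multi periodic with periods $2\pi$ in the
$\varphi$’s. Suppose that the transformation

$$\varphi' = \varphi + \frac{\partial f}{\partial A'}(A', \varphi) \mod 2\pi$$

establishes a nonsingular invertible map of $\mathbb{T}^\ell$ onto itself for each $A' \in \mathbb{R}^\ell$. Suppose, also,
that the transformation

$$A = A' + \frac{\partial f}{\partial \varphi}(A', \varphi)$$

establishes a nonsingular invertible map of $\mathbb{R}^\ell$ onto itself for each $\varphi \in \mathbb{T}^\ell$.
Show that the function $(A', \varphi) \to A' \cdot \varphi + f(A', \varphi)$ generates a completely canonical map
of $\mathbb{R}^\ell \times \mathbb{T}^\ell$ onto itself.


37. Find a “local version” of Problem 36 when $\mathbb{R}^\ell \times \mathbb{T}^\ell$ is replaced by $V \times \mathbb{T}^\ell$, $V \subset \mathbb{R}^\ell$
open.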


38. Consider the maps $(p, q) \to C(p, q) = (A, \varphi)$ and $(p, q) \to \tilde C(p, q) = (B, \varphi)$ of
$\mathbb{R}^2/0 \to \mathbb{R}_+ \times \mathbb{T}^1$ defined by

$$\varphi = \text{polar angle of } (p, q), \qquad A = \frac{1}{2}(p^2 + q^2), \qquad B = \sqrt{A}.$$

Show that while $C$ is completely canonical, the map $\tilde C$ is such that the Hamiltonians $H =
\frac{1}{2}(p^2 + q^2)$ and $H' = B$ are canonically conjugated by it, but the Hamiltonian $\frac{1}{2} p^2$ has no
canonically conjugated Hamiltonian with respect to $\tilde C$. (Hint: $C$ is studied in Problems 1 and
2. Show that a general measure-preserving flow on $\mathbb{R}^2/0$ is not mapped by $\tilde C$ into a measure-
preserving flow on $\mathbb{R}_+ \times \mathbb{T}^1$: the evolution associated with the Hamiltonian $\frac{1}{2} p^2$ is actually mapped
by $\tilde C$ into a non-measure-preserving one. So the image flow cannot be a Hamiltonian flow
since the latter would, instead, preserve the measure, by the Liouville theorem, Proposition
19, in the case of a time-independent Hamiltonian (or by an extension of Proposition 19 in
the time-dependent case; see Problem 39).)


39. Extend Proposition 19, §3.11 to the case of time-dependent Hamiltonian equations.
(Hint: Replace the semigroup property $S_t S_{t'} = S_{t+t'}$ used in the proof of Proposition 19
(see Problem 10, §2.24) by the more general relation $S(t, t') \cdot S(t', t_0) = S(t, t_0)$, $t > t' > t_0$,
where $S(t, t')$ denotes the solution map of the non autonomous Hamiltonian equations when
the initial data are assigned at $t'$. The proof proceeds unchanged.)


40. Let $(q, t) \to S(q, t)$ be defined and $C^\infty$ on a set $U \times J$, $U \subset \mathbb{R}^\ell$, $J \subset \mathbb{R}$, both open
and connected. Let $H$ be a regular Hamiltonian on $V = \mathbb{R}^\ell \times U \times J$ and suppose that $S$ is
a solution to the Hamilton-Jacobi equation

$$H\Big(\frac{\partial S}{\partial q}(q, t), q, t\Big) + \frac{\partial S}{\partial t}(q, t) = 0.$$

Consider the differential equation for $t \to q(t)$,

$$\dot q = \frac{\partial H}{\partial p}\Big(\frac{\partial S}{\partial q}(q, t), q, t\Big), \qquad q(t_0) = q_0$$

and suppose that for all $(q_0, t_0) \in U \times J$, one can solve it for $t$ near $t_0$ by $t \to q(t)$. Show
that setting

$$p(t) = \frac{\partial S}{\partial q}(q(t), t),$$

the functions $t \to (p(t), q(t))$ are solutions to the Hamiltonian equations with initial data
$q(t_0) = q_0$, $p(t_0) = \frac{\partial S}{\partial q}(q_0, t_0)$;
i.e., “every solution to the Hamilton-Jacobi equation provides a bundle of solutions to the
corresponding Hamilton equation”. (Hint: Check it directly by substitution.)


3.12 Completely Canonical Transformations: Their 
Structure 


Among the canonical transformations, the completely canonical transforma- 
tions are very simple and interesting [see Eq. (3.11.39)]. It is therefore impor- 
tant to obtain general results about the structure of such maps. 

Let $V \subset R^\ell\times R^\ell$ or $R^\ell\times T^\ell$ or $R^\ell\times(R^{\ell_1}\times T^{\ell_2})$, $\ell_1+\ell_2=\ell$, be an
open set which is regarded as the phase space of a Hamiltonian system of
differential equations with regular Hamiltonian functions $H\in C^\infty(V)$ (see
Observation (4) to Proposition 17, p.216).


18 Definition. Let V, W be open sets as above and let C be an invertible
nonsingular²¹ $C^\infty$ map between V and W. Denote C as

$$p = P(\pi,\kappa), \qquad q = Q(\pi,\kappa) \tag{3.12.1}$$

21 I.e., with non vanishing Jacobian determinant.

Let $(p_0,q_0) = C(\pi_0,\kappa_0)$, $(p_0,q_0)\in V$, $(\pi_0,\kappa_0)\in W$ be two C-corresponding
points. Define the “linearized C map near $(\pi_0,\kappa_0)$” as the map of $R^\ell\times R^\ell$

$$\begin{aligned}
p &= p_0 + A(\pi-\pi_0) + B(\kappa-\kappa_0),\\
q &= q_0 + C(\pi-\pi_0) + D(\kappa-\kappa_0),
\end{aligned} \tag{3.12.2}$$

where A, B, C and D are the four $\ell\times\ell$ matrices:

$$\begin{aligned}
A_{ij} &= \frac{\partial P_i}{\partial\pi_j}(\pi_0,\kappa_0), &
B_{ij} &= \frac{\partial P_i}{\partial\kappa_j}(\pi_0,\kappa_0),\\
C_{ij} &= \frac{\partial Q_i}{\partial\pi_j}(\pi_0,\kappa_0), &
D_{ij} &= \frac{\partial Q_i}{\partial\kappa_j}(\pi_0,\kappa_0),
\end{aligned} \tag{3.12.3}$$

$i,j = 1,\dots,\ell$, and in Eq. (3.12.2), $\pi_0,\kappa_0,p_0,q_0$ are regarded as elements of
$R^\ell$ (even though $\kappa_0, q_0$ might be in $T^\ell$ or $R^{\ell_1}\times T^{\ell_2}$).²² The $2\ell\times 2\ell$ matrix

$$L = \begin{pmatrix} A & B\\ C & D\end{pmatrix} \tag{3.12.4}$$

is the Jacobian matrix of the map C and, therefore, $\det L \ne 0$, $\forall(\pi_0,\kappa_0)\in W$.


The main structure theorem for the completely canonical time independent 
maps transforming V onto W is: 


23 Proposition. A necessary and sufficient condition for the complete canonicity
of a map C of the type considered in Definition 18, Eq. (3.12.1), is
that the map obtained by linearizing C at $(\pi_0,\kappa_0)\in W$ is a completely canonical
map of $R^{2\ell}$ onto $R^{2\ell}$, $\forall(\pi_0,\kappa_0)\in W$.

This is the case if and only if the inverse matrix to the matrix (3.12.4) is

$$L^{-1} = \begin{pmatrix} D^T & -B^T\\ -C^T & A^T\end{pmatrix} \tag{3.12.5}$$

where the superscript T denotes the transposition of the matrix.


Observations. 

(1) In other words, C is completely canonical in W if and only if its lineariza- 
tion around any point in W is completely canonical. 

(2) Hence, complete canonicity is a “purely local” property of a map: this 
explains why the completely canonical maps are sometimes called “contact 
transformations” (although it does not explain why they are often called “sym- 
plectic” ). 


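22 We recall that on $T^\ell$ we use the flat coordinates: the ambiguity mod $2\pi$ of some of the
coordinates of $\kappa_0$ or $q_0$ is arbitrarily solved here and it is irrelevant in the following.


PROOF. Let $H\in C^\infty(V)$ and $H'(\pi,\kappa) = H(C(\pi,\kappa)) = H(P(\pi,\kappa),Q(\pi,\kappa))$.
The Hamiltonian equations in V are

$$\dot p = -\frac{\partial H}{\partial q}(p,q), \qquad \dot q = \frac{\partial H}{\partial p}(p,q) \tag{3.12.6}$$

and if C is completely canonical, they must be equivalent to the equations

$$\dot\pi = -\frac{\partial H'}{\partial\kappa}(\pi,\kappa), \qquad \dot\kappa = \frac{\partial H'}{\partial\pi}(\pi,\kappa) \tag{3.12.7}$$

i.e., if $t\to(\pi(t),\kappa(t))$ solves Eq. (3.12.7), then the motion $t\to C(\pi(t),\kappa(t)) =
(P(\pi(t),\kappa(t)),Q(\pi(t),\kappa(t)))$ has to solve Eq. (3.12.6). Differentiating
$p_i(t) = P_i(\pi(t),\kappa(t))$ and $q_i(t) = Q_i(\pi(t),\kappa(t))$ with respect to t, one finds

$$\begin{aligned}
\dot p_i &= \sum_{k=1}^{\ell}\big(A_{ik}\dot\pi_k + B_{ik}\dot\kappa_k\big)
 = \sum_{k=1}^{\ell}\Big(-A_{ik}\frac{\partial H'}{\partial\kappa_k} + B_{ik}\frac{\partial H'}{\partial\pi_k}\Big),\\
\dot q_i &= \sum_{k=1}^{\ell}\big(C_{ik}\dot\pi_k + D_{ik}\dot\kappa_k\big)
 = \sum_{k=1}^{\ell}\Big(-C_{ik}\frac{\partial H'}{\partial\kappa_k} + D_{ik}\frac{\partial H'}{\partial\pi_k}\Big),
\end{aligned} \tag{3.12.8}$$

for $i = 1,\dots,\ell$, where the matrices A, B, C, D and the derivatives of H' are
evaluated at $(\pi(t),\kappa(t))$, to simplify the notations. Using the expression of
H' in terms of H, we find from Eq. (3.12.8), $\forall i = 1,\dots,\ell$:

$$\begin{aligned}
\dot p_i &= \sum_{k,j=1}^{\ell}\Big\{-A_{ik}\Big(\frac{\partial H}{\partial p_j}B_{jk} + \frac{\partial H}{\partial q_j}D_{jk}\Big)
 + B_{ik}\Big(\frac{\partial H}{\partial p_j}A_{jk} + \frac{\partial H}{\partial q_j}C_{jk}\Big)\Big\},\\
\dot q_i &= \sum_{k,j=1}^{\ell}\Big\{-C_{ik}\Big(\frac{\partial H}{\partial p_j}B_{jk} + \frac{\partial H}{\partial q_j}D_{jk}\Big)
 + D_{ik}\Big(\frac{\partial H}{\partial p_j}A_{jk} + \frac{\partial H}{\partial q_j}C_{jk}\Big)\Big\},
\end{aligned} \tag{3.12.9}$$

where the derivatives of H are computed in the point $(P(\pi,\kappa),Q(\pi,\kappa))$, and
the matrices A, B, C, and D have to be computed in $(\pi,\kappa)$. Equation (3.12.9)
can be more compactly written with matrix-product notations:

$$\begin{aligned}
\dot p &= (-AB^T + BA^T)\frac{\partial H}{\partial p} + (-AD^T + BC^T)\frac{\partial H}{\partial q},\\
\dot q &= (-CB^T + DA^T)\frac{\partial H}{\partial p} + (-CD^T + DC^T)\frac{\partial H}{\partial q}.
\end{aligned} \tag{3.12.10}$$

We now impose that Eq. (3.12.10) reduces to Eq. (3.12.6), $\forall H\in C^\infty(V)$.
Since the vector in the right-hand side of Eq. (3.12.10) can be made arbitrary
by varying H, if the point (p,q) where the derivatives are evaluated is kept
fixed, it follows that

$$AB^T - BA^T = 0, \qquad CD^T - DC^T = 0, \qquad -AD^T + BC^T = -I, \tag{3.12.11}$$

where I = ($\ell\times\ell$ identity matrix). Note that

$$\begin{pmatrix} AB^T - BA^T & -AD^T + BC^T\\ -DA^T + CB^T & -CD^T + DC^T\end{pmatrix}
= \begin{pmatrix} A & B\\ C & D\end{pmatrix}\begin{pmatrix} B^T & -D^T\\ -A^T & C^T\end{pmatrix} \tag{3.12.12}$$

hence, Eq. (3.12.11) can be written as

$$\begin{pmatrix} A & B\\ C & D\end{pmatrix}\begin{pmatrix} B^T & -D^T\\ -A^T & C^T\end{pmatrix}
= \begin{pmatrix} 0 & -I\\ -I & 0\end{pmatrix} \tag{3.12.13}$$

or, multiplying both sides on the right by $\begin{pmatrix} 0 & -I\\ -I & 0\end{pmatrix}$,

$$\begin{pmatrix} A & B\\ C & D\end{pmatrix}\begin{pmatrix} D^T & -B^T\\ -C^T & A^T\end{pmatrix}
= \begin{pmatrix} I & 0\\ 0 & I\end{pmatrix} \tag{3.12.14}$$

which implies Eq. (3.12.5).

Vice versa, if Eq. (3.12.5) holds everywhere in W, the above equalities can
be run backwards.

If H is explicitly time dependent, its conjugacy via C with H' defined by
$H'(\pi,\kappa,t) = H(C(\pi,\kappa),t)$ follows in an identical fashion. mbe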
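
The conditions (3.12.11) are easy to test numerically on given Jacobian blocks. The following is a minimal sketch using only the Python standard library; the helper names (`transpose`, `matmul`, `is_completely_canonical`) and the sample maps are illustrative choices, not notation from the text.

```python
import math

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    # rows of X times columns of Y
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def is_completely_canonical(A, B, C, D, tol=1e-12):
    """Check A B^T = B A^T, C D^T = D C^T and A D^T - B C^T = I, cf. Eq. (3.12.11)."""
    n = len(A)
    ident = [[float(i == j) for j in range(n)] for i in range(n)]
    At, Bt, Ct, Dt = (transpose(M) for M in (A, B, C, D))
    ADt, BCt = matmul(A, Dt), matmul(B, Ct)
    third = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(ADt, BCt)]
    return (max_abs_diff(matmul(A, Bt), matmul(B, At)) <= tol
            and max_abs_diff(matmul(C, Dt), matmul(D, Ct)) <= tol
            and max_abs_diff(third, ident) <= tol)

# l = 1: a rotation of the (pi, kappa) plane, and a uniform dilation by 2.
a = 0.3
rotation = ([[math.cos(a)]], [[-math.sin(a)]], [[math.sin(a)]], [[math.cos(a)]])
dilation = ([[2.0]], [[0.0]], [[0.0]], [[2.0]])

# l = 2: the shear p = pi + S kappa, q = kappa, with S symmetric or not.
I2 = [[1.0, 0.0], [0.0, 1.0]]
Z2 = [[0.0, 0.0], [0.0, 0.0]]
shear_symmetric = (I2, [[1.0, 2.0], [2.0, 5.0]], Z2, I2)
shear_asymmetric = (I2, [[1.0, 2.0], [0.0, 5.0]], Z2, I2)

results = [is_completely_canonical(*blocks)
           for blocks in (rotation, dilation, shear_symmetric, shear_asymmetric)]
print(results)
```

The rotation and the symmetric shear satisfy the conditions, while the dilation violates the third one and the non-symmetric shear violates the first.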


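24 Proposition. The Jacobian determinant of any completely canonical
transformation is ±1.


PROOF. Equation (3.12.14) can be written

$$\begin{pmatrix} A & B\\ C & D\end{pmatrix}\begin{pmatrix} 0 & -I\\ I & 0\end{pmatrix}
\begin{pmatrix} A^T & C^T\\ B^T & D^T\end{pmatrix}\begin{pmatrix} 0 & -I\\ I & 0\end{pmatrix} = -1 \tag{3.12.15}$$

where 1 denotes the $2\ell\times 2\ell$ identity matrix; i.e.,

$$L\begin{pmatrix} 0 & -I\\ I & 0\end{pmatrix}L^T = \begin{pmatrix} 0 & -I\\ I & 0\end{pmatrix}. \tag{3.12.16}$$

Hence, taking the determinant of both sides and remarking that the matrix
$E = \begin{pmatrix} 0 & -I\\ I & 0\end{pmatrix}$ has determinant $\det E = 1 \ne 0$, it follows that

$$(\det L)^2 = 1 \tag{3.12.17}$$

mbe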


It could be shown that, actually, det L = +1, see problem (16) below. 
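
For orientation, the case $\ell = 1$ can be worked out by hand. In this sketch A, B, C, D are just numbers, as in Problem 6 of §3.12.1:

```latex
% \ell = 1: the blocks A, B, C, D of L are scalars.
\begin{aligned}
AB^T - BA^T &= AB - BA = 0, \qquad CD^T - DC^T = CD - DC = 0 \quad\text{(automatically)},\\
-AD^T + BC^T = -I \quad&\Longleftrightarrow\quad AD - BC = 1 \quad\Longleftrightarrow\quad \det L = 1 .
\end{aligned}
```

So for one degree of freedom the conditions (3.12.11) reduce to the single requirement $\det L = 1$, and the value $-1$ allowed by Eq. (3.12.17) never occurs.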

The conditions (3.12.15) or (3.12.11) for complete canonicity, equivalent 
to Eq. (3.12.5), can be expressed in terms of the following notion of “Poisson 
bracket” of two observables. 


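19 Definition. Let V be an open subset of $R^\ell\times R^\ell$ or $R^\ell\times T^\ell$ or
$R^\ell\times(R^{\ell_1}\times T^{\ell_2})$, $\ell_1+\ell_2=\ell$, regarded as the phase space for the Hamiltonian
equations in V. Let $F, G\in C^\infty(V)$ be two “observables”. One defines the
“Poisson bracket” $\{F,G\}\in C^\infty(V)$ of F and G as

$$\{F,G\}(p,q) = \sum_{i=1}^{\ell}\Big(\frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i}
 - \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i}\Big)(p,q) \tag{3.12.18}$$

Observations.
(1) Clearly, $\forall i,j = 1,\dots,\ell$,

$$\{p_i,p_j\} = 0, \qquad \{q_i,q_j\} = 0, \qquad \{p_i,q_j\} = \delta_{ij}. \tag{3.12.19}$$

Also, if $\psi_1,\dots,\psi_r$ are $C^\infty$ functions on $R^n$ and $F_1,\dots,F_n\in C^\infty(V)$, and if
one defines

$$\Phi_j(p,q) = \psi_j(F_1(p,q),\dots,F_n(p,q)), \tag{3.12.20}$$

one finds:

$$\{\Phi_i,\Phi_j\} = \sum_{h,k=1}^{n}\frac{\partial\psi_i}{\partial F_h}\frac{\partial\psi_j}{\partial F_k}\{F_h,F_k\}. \tag{3.12.21}$$

(2) Sometimes the definition (3.12.18) is given with the opposite sign: this is
totally irrelevant despite claims to the contrary.
(3) Equations (3.12.19) are also called the “canonical commutation” relations.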


The notion of Poisson bracket is remarkable as it appears from the follow- 
ing corollary to the Proposition 23, p.234. 


25 Corollary. A necessary and sufficient condition for the complete canonicity
of an invertible nonsingular map C between V and W (as in Definition
17, p.220, above) is that the functions defining it, $P(\pi,\kappa)$, $Q(\pi,\kappa)$, have the
property, $\forall(\pi,\kappa)\in W$, $\forall i,j = 1,\dots,\ell$:

$$\{P_i,P_j\}(\pi,\kappa) = 0, \qquad \{Q_i,Q_j\}(\pi,\kappa) = 0, \qquad \{P_i,Q_j\}(\pi,\kappa) = \delta_{ij}, \tag{3.12.22}$$

where the Poisson brackets are computed with respect to the variables $(\pi,\kappa)$.


Observations.

(1) So C is completely canonical if and only if it “preserves the canonical
commutation relations”.

(2) If C preserves the canonical commutation relations, it follows that it preserves
the Poisson brackets of any pair of observables: this means that if
$F, G\in C^\infty(V)$ and if we define

$$F_C(\pi,\kappa) = F(C(\pi,\kappa)), \qquad G_C(\pi,\kappa) = G(C(\pi,\kappa)), \tag{3.12.23}$$

then, as is checked by Eqs. (3.12.21) and (3.12.22):

$$\{F,G\}(p,q) = \{F_C,G_C\}(\pi,\kappa) \quad\text{if } (p,q) = C(\pi,\kappa). \tag{3.12.24}$$

So C is completely canonical if and only if it “preserves the commutation
relations of any pair of observables”.


PROOF. Explicitly write Eq. (3.12.22) in terms of the derivatives of Eq. 
(3.12.3): one finds that they become Eq. (3.12.11), i.e., Eq. (3.12.5). 
mbe 
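
The criterion of Corollary 25 can be checked numerically for one degree of freedom. The sketch below approximates the bracket of Definition 19 with central differences; the two sample maps (a polar-type map with a square root, and the same map without it) are illustrative assumptions, not maps defined in this section.

```python
import math

def poisson_bracket(F, G, pi, kappa, h=1e-5):
    """{F,G} = dF/dpi * dG/dkappa - dF/dkappa * dG/dpi (l = 1), by central differences."""
    dF_dpi = (F(pi + h, kappa) - F(pi - h, kappa)) / (2 * h)
    dF_dka = (F(pi, kappa + h) - F(pi, kappa - h)) / (2 * h)
    dG_dpi = (G(pi + h, kappa) - G(pi - h, kappa)) / (2 * h)
    dG_dka = (G(pi, kappa + h) - G(pi, kappa - h)) / (2 * h)
    return dF_dpi * dG_dka - dF_dka * dG_dpi

# Candidate map (pi, kappa) = (a, phi) -> (p, q) with the square root ...
def P(a, phi):
    return math.sqrt(2 * a) * math.cos(phi)

def Q(a, phi):
    return math.sqrt(2 * a) * math.sin(phi)

# ... and the "simpler" map without it.
def P_plain(a, phi):
    return a * math.cos(phi)

def Q_plain(a, phi):
    return a * math.sin(phi)

point = (0.7, 1.3)
bracket_sqrt = poisson_bracket(P, Q, *point)               # close to 1
bracket_plain = poisson_bracket(P_plain, Q_plain, *point)  # close to a = 0.7
print(bracket_sqrt, bracket_plain)
```

Only the first map satisfies $\{P,Q\} = 1$; for the second the bracket equals the first coordinate, so it is not completely canonical.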


In §3.11 it has been shown that a class of completely canonical transfor- 
mations can be built from a generating function. One can wonder how general 
this construction is. 


26 Proposition. Let C be a completely canonical map between V and W
as in Definition 17, p.220. Given two corresponding points $(p_0,q_0)\in V$,
$(\pi_0,\kappa_0)\in W$, $(p_0,q_0) = C(\pi_0,\kappa_0)$, consider the matrices (3.12.3). Then
C can be generated near $(p_0,q_0)$, $(\pi_0,\kappa_0)$ by a generating function, as in
Proposition 21, p.220, and in the observations following it, having the form:

$$\begin{aligned}
&\text{(i)}\ \ F(q,\kappa) &&\text{if } \det C \ne 0,\\
&\text{(ii)}\ \ \Phi(p,\kappa) &&\text{if } \det A \ne 0,\\
&\text{(iii)}\ \ \Psi(\pi,q) &&\text{if } \det D \ne 0,\\
&\text{(iv)}\ \ R(p,\pi) &&\text{if } \det B \ne 0.
\end{aligned} \tag{3.12.25}$$

Observations. 
(1) There exist completely canonical transformations for which $\det A =
\det B = \det C = \det D = 0$. For instance, the map of $R^2\times R^2\to R^2\times R^2$

$$(p_1,p_2,q_1,q_2)\to(p_1,-q_2,q_1,p_2). \tag{3.12.26}$$

This canonical transformation cannot be generated by a generating function
of the above types.

(2) If C is completely canonical, defined on $V\subset R^{2\ell}$, it must have a Jacobian
matrix L with non vanishing determinant (see Proposition 24, p.236). Hence,
there must be a choice of indices $i_1,\dots,i_s,j_1,\dots,j_{\ell-s}$, pairwise distinct, with

$$\det\frac{\partial(P_{i_1},\dots,P_{i_s},Q_{j_1},\dots,Q_{j_{\ell-s}})}{\partial(\pi_1,\dots,\pi_\ell)} \ne 0. \tag{3.12.27}$$

This means, as it can be understood with a little thought, that C can be locally
constructed by composing a canonical transformation of the type:

$$(p_1,\dots,p_\ell,q_1,\dots,q_\ell)\to
(p_{i_1},\dots,p_{i_s},-q_{j_1},\dots,-q_{j_{\ell-s}},q_{i_1},\dots,q_{i_s},p_{j_1},\dots,p_{j_{\ell-s}}) \tag{3.12.28}$$

[like Eq. (3.12.26)] with a completely canonical transformation generated by
a function $\Phi(p,\kappa)$.

(3) So any completely canonical transformation is, near a point, a composition
of a trivial (“permutation type”) completely canonical transformation and a
completely canonical transformation with a generating function (Arnold).
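
The claims of Observation (1) about the map (3.12.26) can be verified by direct computation. This small sketch writes down its Jacobian blocks and evaluates the determinants and the three expressions of Eq. (3.12.11); the helper names are illustrative.

```python
# Blocks of the Jacobian of (p1, p2, q1, q2) -> (p1, -q2, q1, p2), Eq. (3.12.26):
# rows index the new variables, columns the old ones.
A = [[1, 0], [0, 0]]    # d(new p) / d(old p)
B = [[0, 0], [0, -1]]   # d(new p) / d(old q)
C = [[0, 0], [0, 1]]    # d(new q) / d(old p)
D = [[1, 0], [0, 0]]    # d(new q) / d(old q)

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def minus(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

dets = [det2(M) for M in (A, B, C, D)]
cond1 = minus(matmul(A, transpose(B)), matmul(B, transpose(A)))  # should vanish
cond2 = minus(matmul(C, transpose(D)), matmul(D, transpose(C)))  # should vanish
cond3 = minus(matmul(A, transpose(D)), matmul(B, transpose(C)))  # should be the identity
print(dets, cond1, cond2, cond3)
```

All four blocks are singular, yet the three conditions hold, so none of the four generating-function types of Proposition 26 is available while the map is still completely canonical.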


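PROOF. Suppose, for instance, $\det D \ne 0$. Then it is possible to invert the
equation $q = Q(\pi,\kappa)$ to express

$$\kappa = G(\pi,q), \tag{3.12.29}$$

for $(q,\pi,\kappa)$ near $(q_0,\pi_0,\kappa_0)$, using the implicit function theorem (see Appendix G).
Then we can write

$$p = P(\pi,G(\pi,q)) \equiv F(\pi,q), \qquad \kappa = G(\pi,q) \tag{3.12.30}$$

and we must show existence of $\Psi(\pi,q)$ such that

$$\frac{\partial\Psi}{\partial q}(\pi,q) = F(\pi,q), \qquad \frac{\partial\Psi}{\partial\pi}(\pi,q) = G(\pi,q) \tag{3.12.31}$$

defined near $\pi_0, q_0$. This means checking the integrability conditions:

$$\frac{\partial F_i}{\partial q_j} = \frac{\partial F_j}{\partial q_i}, \qquad
\frac{\partial F_i}{\partial\pi_j} = \frac{\partial G_j}{\partial q_i}, \qquad
\frac{\partial G_i}{\partial\pi_j} = \frac{\partial G_j}{\partial\pi_i}. \tag{3.12.32}$$

Differentiation of the first of Eqs. (3.12.30) yields

$$\frac{\partial F_i}{\partial\pi_j} = A_{ij} + \sum_{k=1}^{\ell}B_{ik}\frac{\partial G_k}{\partial\pi_j}, \qquad
\frac{\partial F_i}{\partial q_j} = \sum_{k=1}^{\ell}B_{ik}\frac{\partial G_k}{\partial q_j}, \tag{3.12.33}$$

$\forall i,j = 1,\dots,\ell$, with the obvious choice of arguments of these functions; e.g.
$\pi = \pi_0$, $q = q_0$. Differentiation of the identity $\kappa = G(\pi,Q(\pi,\kappa))$ gives

$$\sum_{k=1}^{\ell}\frac{\partial G_i}{\partial q_k}D_{kj} = \delta_{ij}, \qquad
\frac{\partial G_i}{\partial\pi_j} + \sum_{k=1}^{\ell}\frac{\partial G_i}{\partial q_k}C_{kj} = 0, \tag{3.12.34}$$

$\forall i,j = 1,\dots,\ell$. More concisely, rewrite Eqs. (3.12.33) and (3.12.34) as

$$\frac{\partial F}{\partial\pi} = A - BD^{-1}C, \qquad \frac{\partial F}{\partial q} = BD^{-1}, \qquad
\frac{\partial G}{\partial q} = D^{-1}, \qquad \frac{\partial G}{\partial\pi} = -D^{-1}C, \tag{3.12.35}$$

and the conditions (3.12.32) become

$$\begin{aligned}
&A - BD^{-1}C = (D^{-1})^T,\\
&BD^{-1} = (BD^{-1})^T, \quad\text{i.e. } BD^{-1} = (D^{-1})^TB^T, \quad\text{i.e. } D^TB = B^TD,\\
&D^{-1}C = (D^{-1}C)^T, \quad\text{i.e. } D^{-1}C = C^T(D^{-1})^T, \quad\text{i.e. } CD^T = DC^T.
\end{aligned} \tag{3.12.36}$$

So it remains to check that Eqs. (3.12.36) are a disguised form of the complete
canonicity conditions (3.12.11), and the proof will be complete.

In fact, the third of Eqs. (3.12.36) is implied by the second of Eqs. (3.12.11).
Furthermore, using the first and the transposition of the third of Eqs. (3.12.11)
we see that²³

$$AB^T = BA^T = BD^{-1}DA^T = BD^{-1}(I + CB^T) = BD^{-1} + B(D^{-1}C)B^T \tag{3.12.37}$$

which shows that $BD^{-1}$ is symmetric since such are $AB^T$ and $B(D^{-1}C)B^T$
(having already seen that $D^{-1}C$ is symmetric). So the second of Eqs. (3.12.36)
also holds. Finally, multiplying on the right by $D^T$, the first of Eqs. (3.12.36) means

$$AD^T - BD^{-1}CD^T = I \tag{3.12.38}$$

which, since $CD^T = DC^T$ by Eq. (3.12.11), simply means that $AD^T - BC^T = I$.
This is true, because it is the third of Eqs. (3.12.11). mbe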


3.12.1 Problems and Complements 


1. Let C be a map of W onto W' and suppose that there is $\Phi\in C^\infty(G(C))$, $G(C) =
\{p,q,p',q' \mid (p,q,p',q')\in W\times W',\ (p,q) = C(p',q')\}$, such that

$$p\cdot dq = p'\cdot dq' + d\Phi.$$

Show that C is a time independent completely canonical map and that it is also “action
preserving” in the sense that, if $\Lambda$ is a closed curve in W' and $C\Lambda$ is its C-image in W, it is

$$\oint_{C\Lambda} p\cdot dq = \oint_{\Lambda} p'\cdot dq'.$$


2. Consider in $R^2$ the annulus $D = \{(q_1,q_2) \mid \alpha < q_1^2+q_2^2 < \beta,\ \alpha,\beta > 0\}$, and let
$f(q_1,q_2)\,dq_1 + g(q_1,q_2)\,dq_2$ be an exact but non integrable differential form on D. Define

$$C(p_1,p_2,q_1,q_2) = (p_1',p_2',q_1',q_2') \stackrel{def}{=} (p_1 + f(q_1,q_2),\ p_2 + g(q_1,q_2),\ q_1,\ q_2).$$

Show that it is completely canonical (time independent, of course). (Hint: Note that
$p_1'dq_1' + p_2'dq_2' = p_1dq_1 + p_2dq_2 + f(q_1,q_2)dq_1 + g(q_1,q_2)dq_2$ and recall that every exact form is locally
integrable and that the complete canonicity is a local property and use Problem (1).)


3. Show that not all completely canonical maps are action preserving in the sense of Problem
(1). (Hint: Consider the map in Problem 2 and choose $\Lambda$ to be the curve $p_1 = p_2 =
0$, $q_1^2+q_2^2 = \tfrac12(\alpha+\beta)$.)


4.* Show that the existence of $\Phi\in C^\infty(G(C))$ verifying the property introduced in Problem
(1) is a necessary and sufficient condition in order that C be an action preserving time
independent completely canonical map of W onto W'.


23 We use the fact that if M is a symmetric $\ell\times\ell$ matrix and F is an arbitrary $\ell\times\ell$ matrix,
then $F^TMF$ is a symmetric matrix (exercise).
5. Show that the map $C(p_x,p_y,x,y) = (p_1,p_2,q_1,q_2)$ between $R^2\times(R\times R_+)$/(set of
points with ps = 0) and $R^4$/(set of points with $p_1q_1+p_2q_2 < 0$ or with $p_1 < 0$) defined by
the following relation, setting $i = \sqrt{-1}$:


p2 +iqı 
pı +iq2 


p= ps +ipy = 5(P1 Hiq), q=r+iy= 


is completely canonical (time independent). (Hint: Check that it is one-to-one and prdxz + 
pydy = Re pdq = pıdqı + p2dq2 — $d(pigi + p2q2).) 


6. Let $(p',q') = C(p,q)$ be a map from $W\subset R^2$ onto $W'\subset R^2$. Show that a necessary
and sufficient condition for C being completely canonical is that it is orientation and area
preserving (recall that a map is “orientation preserving” if its Jacobian matrix L has positive
determinant). (Hint: Note that the matrices A, B, C, D are numbers and therefore (3.12.5)
holds if and only if det L = 1.)


7. Extend the notion of completely canonical time independent map by replacing (hence 
extending) (3.11.52) by S(m) = AX (u) + constant. Discuss the case à = —1 and prove a 
proposition like Proposition 23. Find the physical meaning of A. 


8.* Consider the Hamiltonian on R? x (R x R+): Ho(px; Py, 7i y) = iy? (p2 +p), (“Hamil- 
tonian for the geodesic motion for the geometry ds? = as on R X R+, see Prob- 


lems 19-24, p.230), and show that the canonical map in Problem (5) transforms Ho into 
(pi qı + poqo)?. Write and solve the Hamilton’s equations in the new coordinates. 


9.* In the context of Problem (8) consider the canonical map p| = pn, q = 2n 1D = 
2 4 wpe 1 
P2722 q = pp and show that H is transformed by it into 5((p{)? — (45)? + (p4)? 


(q4))?. Interpret this as saying that the geodesic motions of Hp taking place at a given 
energy E can be thought of as describing the motions of two independent hyperbolic oscil- 
lators (i.e. two particles on a negative quadratic potential). How does this picture change 
as E varies? 


10.* Show that the map (p1, p2,q1, 92) > (px, Py, £, y) defined in Problem 5 is one-to-one 
from G = R4/(set of points for which piqi + p2q2 < 0) onto G! = R? x R x R4/(set 
of points for which pr = py = 0). If however the “opposite” points (p1,p2, q1, 492) 
and (—p1,—p2, —q1, —q2) are identified, the map becomes one-to-one. Then remark that 
(p1, p2,q1,q2) may be regarded as coordinates (modulo the sign) for the points of the set 
G' = {G with opposite points identified} in the same sense as a point y € R! can be 
regarded as a coordinate (modulo 27) for a point in T°. 

Using this remark extend the notion of time independent completely canonical maps to 
cover the case when W instead of being a subset of V x (JT! x R®) is a subset of G” and 
show that the map under consideration is completely canonical, in this new sense, as a map 
between G and G”. 


11. Try to extend the notion of completely canonical time independent map to maps of ar- 
bitrary open surfaces of dimension 2¢ by abstracting the essential properties of the examples 
discussed in definition 17, p.220, and in Problem (10) where the 2¢-dimensional surfaces are 
very special, i.e. they are, respectively, of the form V x T% x R®, bı +42 =L, V CRE, or 
the set G with opposite points identified. 


12. Let W be the phase space for a regular time independent Hamiltonian function H, see
Observation (4), p.216. Let T > 0, $(p,q)\in W$, and suppose that the solution $S_t(p,q)$ to the
Hamiltonian equations with initial datum (p,q) stays in W for all $t\in[0,T]$: $S_t(p,q)\in W$.
Given an observable F, define $F_t(p,q) = F(S_t(p,q))$ and show that

$$\frac{dF_t(p,q)}{dt} = \{H,F_t\}(p,q),$$

i.e., “the time derivative of an observable F is given by its Poisson bracket with the
Hamiltonian”.


13. In the context of (12) show that $\{H,F_t\}(p,q) = \{H,F\}(S_t(p,q))$: since in Physics
the operation of associating with $F\in C^\infty(W)$ the function $\mathcal{L}F = \{H,F\}$ is called the
“Liouville’s operator action” this can be read: the “Liouville operator commutes with the
time evolution”.


14. Let E, F, G be in $C^\infty(W)$, where W is the phase space for a regular time independent
Hamiltonian H. Show that

$$\{E,\{F,G\}\} + \{F,\{G,E\}\} + \{G,\{E,F\}\} = 0, \qquad \{E,F\} = -\{F,E\},$$

$$\{E,FG\} = \{E,F\}G + \{E,G\}F.$$

These relations are called respectively “the Jacobi identity”, the “antisymmetry” and the
“derivation property” of the Poisson bracket.


15. Show that, in the context of Problem (12), the relations (“Liouville’s equations”)

$$\frac{d}{dt}F(S_t(p,q)) = \{H,F\}(S_t(p,q)) = \mathcal{L}F(S_t(p,q))$$

imply, if valid for all $F\in C^\infty(W)$, for all $(p,q)\in W$ and for t small (depending possibly
on (p,q)), that $t\to S_t(p,q)$ verifies the Hamilton’s equations (“equivalence between the
Hamilton’s equations and the Liouville’s equations”).


Other problems on canonical maps can be found at the end of §4.9-4.12 and §5.10 and
§5.12.


16. Let $C(\pi,\kappa) = (p,q)$ be a completely canonical map defined between sets $U, W\subset R^{2\ell}$.
Then the Jacobian matrix L of C has determinant det L = 1. (Hint:


Write L as Spa) and suppose that C has a generating function F(q, K), for instance. Then 
express (p,q) as functions of (K, q) first and remark that the Jacobian of this map is 


ar arr 
O(p, qa) _ Gg ~ 3q ) 


OK, q 0 1 
whose determinant is (—1)¢ det EE. Similarly the Jacobian of the map (k, q) > (7, K) is 
aFy)— 
O(KK, q) -( on? at i 
O(a, k) (2E 1 0 


2 
whose determinant is (—1)¢ (det taba) The identity 


p- 2:9) _ OP,9) | A(«,4) 
alm, n) alka) Olm, n) 


implies, therefore, det L = 1 (from [28] p.199.)) 


Concluding Comments to Chapter 3 


(1) We have described by the word “action” certain quantities which, in fact, 
do not motivate such a nice name [see Eqs. (3.3.4), etc.]. Actually, in contem- 
porary literature, the convention of calling Eq. (3.3.4) “action of a motion”, 
or “least action principle” the corresponding variational principle, prevails. 
This is perhaps historically incorrect: the action was introduced by Maupertuis
when he formulated the variational principle bearing his name, Problem
(16) p.229.²⁴ The numerical value of the quantity that Maupertuis called
“action of a motion”, computed on the real motion developing under the in- 
fluence of given conservative forces and ideal constraints, is related, in a very 
simple way, to the value of the action of Eq. (3.3.4) computed on the real 
motion (see Problem 15, §3.11 p.229). The same occurs for the numerical 
value of other quantities also sometimes called “action”, see, for instance, Eq. 
(3.11.50). These simple relations explain why there is so much confusion in 
the names. However, it should be stressed that among the various notions 
of action there are simple relations only if we compare the numerical values 
that they have on the real motions: it would not make sense to ask if there 
is a simple relation between the values taken on the varied motions (mainly 
because in the different variational principles, the motions are described and 
parameterized differently and, therefore, one cannot compare them). 

(2) It is interesting to quote Maupertuis in connection with his definition of 
action, afterwards interpreted by Euler as in Problem 16, p.229 (quoted from 
[31], Chapter II, §2.8): 


We must explain what is meant by quantity of action. When a body is 
moved from one point to another, a certain action is necessary. This action 
depends upon the velocity of body, upon the space it covers, but it is neither 
the velocity nor the space separately considered. The greater the body’s velocity 
and the longer the path that it covers, the greater the action; the action is pro- 
portional to the sum of the spaces, each multiplied by the speed with which the 
bodies cover them. It is the quantity of action, the true expenditure of Nature, 
which she administers with as much economy as possible in the movement of 
light 


The last line refers to Maupertuis’ application of his principle to the prop- 
agation of light. The other lines are a nice way of saying 


$$A = \int_{\xi_1}^{\xi_2} v\cdot dq = \frac{1}{m}\int_{\xi_1}^{\xi_2} p\cdot dq,$$

and the condition of stationarity of A on a motion t > (p(t), q(t)) of given 
energy E can be shown to be equivalent to the stationarity of the quantity in 
Problem 16, §3.11 (a further problem for the reader). 

For a comment on Maupertuis’ definition, see the angry pages of E. Mach
([31], Chapter III, §8.4).
(3) To understand the historical development of the various principles, one
can consult Mach, where they are critically discussed, paying due attention to
history. In his book ([31], Chapter IV, §2), one also finds an interesting comment
on the “theological, animistic and mystical points of view in mechanics”
(see, also, Observation (2), p.164).

24 The original formulation was, in fact, quite obscure and it was later clarified by Euler
(see [31], Ch. II).
(4) Concrete and interesting exercises for this chapter can be found in the 
book [32]. 

For §3.1 and §3.3 see:
Chapter 3 §10, §11, §12;
Chapter 4 §21, §22, §23;
Chapter 9 §26, §27, §28, §29, §30, §31, §32, §33;
Chapter 10 §4, §35, §36, §38.

For §3.3, §3.4, §3.5, and §3.8 see:
Chapter 4, §13, §14;
Chapter 5, §15, §16, §17, §18;
Chapter 10, §37, §39, §43;
Chapter 11, §46, §47, §48;
Chapter 6, §19, §20.

For §3.11 and §3.12, see:
Chapter 11 §49.

One can also consult the book [16].
For §3.1 and §3.2 see:
Chapters 6 and 11.
For §3.3, §3.4, §3.5, and §3.8 see:
Chapters 2, 3, 5, 7, 8, 10, 12, 13, 14, 17, 18, and 21.


4


Special Mechanical Systems


4.1 Systems of Linear Oscillators 


In this chapter we adhere systematically to the convention of denoting and
writing the Lagrangian functions that we shall meet as $\mathcal{L}(\dot x,x,t)$ or $\mathcal{L}(\dot x,x)$ or
$\mathcal{L}(\dot q,q,t)$, rather than as functions of generic variables $(\alpha,\beta,t)$: the notation
is obviously improper since in such cases the variables $\dot x$ and $x$ are not Cartesian
coordinates but local (or toroidal) coordinates, and often the mechanical
systems will be described directly in local coordinates omitting the obvious
but tedious discussion necessary when the local coordinates are not global
(i.e., they are not globally equivalent to Cartesian coordinates).

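A typical example of this situation is when one says that a point of mass $m$
ideally bound to remain on the sphere of radius $\rho$ is described by a Lagrangian
function given, in polar coordinates, by

$$\mathcal{L}(\dot\vartheta,\dot\varphi,\vartheta,\varphi,t) = \frac{m\rho^2}{2}\big(\dot\vartheta^2 + (\sin\vartheta)^2\dot\varphi^2\big) \tag{4.1.1}$$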


After a little practice and thought, this notational convention, very common 
in literature, will appear natural and should not give rise to any confusion. 

Hence, a system of linear oscillators, each with 1 degree of freedom, is the
mechanical system defined by

$$\mathcal{L}(\dot x,x) = \frac12\sum_{i,j=1}^{\ell}g_{ij}\dot x_i\dot x_j - \frac12\sum_{i,j=1}^{\ell}v_{ij}x_ix_j, \tag{4.1.2}$$

where $G = (g_{ij})_{i,j=1,\dots,\ell}$, $V = (v_{ij})_{i,j=1,\dots,\ell}$ are two positive
definite matrices (see Appendix F, p.525). The Lagrangian equations corresponding
to Eq. (4.1.2) are

$$\sum_{j=1}^{\ell}g_{ij}\ddot x_j = -\sum_{j=1}^{\ell}v_{ij}x_j, \qquad i = 1,\dots,\ell. \tag{4.1.3}$$

They can be treated in full generality and their theory is summarized by the
following proposition stating that essentially Eq. (4.1.3) is equivalent, through
a “simple” transformation, to $\ell$ equations of the type:

$$\ddot y_i = -\omega_i^2y_i, \qquad i = 1,\dots,\ell. \tag{4.1.4}$$


1 Proposition. The most general solution of Eq. (4.1.3) for $t\in R$ can be written
in terms of $\ell$ arbitrary non-negative constants $A = (A_1,\dots,A_\ell)$ and of $\ell$ angles
$\varphi = (\varphi_1,\dots,\varphi_\ell)$ as

$$x(t) = \sum_{i=1}^{\ell}\sqrt{\frac{2A_i}{\omega_i}}\,\eta^{(i)}\cos(\omega_it+\varphi_i), \tag{4.1.5}$$

where $\omega_1,\dots,\omega_\ell$ are the $\ell$ positive solutions of the $\ell$-th order equation for $\omega^2$:

$$\det(-\omega^2G + V) = 0 \tag{4.1.6}$$

and the vectors $\eta^{(1)},\dots,\eta^{(\ell)}$ verify the equation:

$$-\omega_i^2G\eta^{(i)} + V\eta^{(i)} = 0, \qquad i = 1,\dots,\ell \tag{4.1.7}$$

and they can be chosen so that

$$(G\eta^{(i)})\cdot\eta^{(j)} = \delta_{ij}, \qquad i,j = 1,\dots,\ell. \tag{4.1.8}$$


Observations.

(1) In Eq. (4.1.5), one could of course write $A_i$ instead of $\sqrt{2A_i/\omega_i}$: however,
the square root is more convenient since in this way the map $(\dot x(0),x(0))\to
(A,\varphi)$ can be related to a canonical transformation [see Exercises for §4.1 and
Observation (3) to Corollary 3, p.249].

(2) Therefore, Eq. (4.1.3) admits periodic solutions like

$$\sqrt{\frac{2A}{\omega}}\,\eta\cos(\omega t+\varphi). \tag{4.1.9}$$

Such oscillations are called “normal vibration modes” or “normal motions”.
The preceding proposition tells us that there exist $\ell$ (independent) normal
modes, orthogonal in the sense of Eq. (4.1.8) and that every oscillation is a
“superposition” of normal modes.
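
To make Eqs. (4.1.6)-(4.1.8) concrete, the following sketch computes the normal modes of a system with two degrees of freedom. The matrices G and V are made-up sample data, the function names are illustrative, and the code assumes the two frequencies are distinct.

```python
import math

def normal_modes_2x2(G, V):
    """Return [(omega_i, eta_i)] for symmetric positive-definite 2x2 G and V,
    assuming distinct frequencies; eta_i is normalised as in Eq. (4.1.8)."""
    (g11, g12), (_, g22) = G
    (v11, v12), (_, v22) = V
    # det(V - lam*G) = a*lam^2 + b*lam + c with lam = omega^2, cf. Eq. (4.1.6)
    a = g11 * g22 - g12 ** 2
    b = -(v11 * g22 + v22 * g11 - 2.0 * v12 * g12)
    c = v11 * v22 - v12 ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    modes = []
    for lam in ((-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)):
        m11, m12 = v11 - lam * g11, v12 - lam * g12
        m21, m22 = v12 - lam * g12, v22 - lam * g22
        # null vector of the singular matrix M = V - lam*G, from its larger row
        if abs(m11) + abs(m12) >= abs(m21) + abs(m22):
            eta = [m12, -m11]
        else:
            eta = [m22, -m21]
        norm = math.sqrt(g_dot(G, eta, eta))
        modes.append((math.sqrt(lam), [e / norm for e in eta]))
    return modes

def g_dot(G, u, v):
    """The scalar product (G u) . v."""
    return sum(G[i][j] * u[j] * v[i] for i in range(2) for j in range(2))

G = [[2.0, 0.0], [0.0, 1.0]]     # sample "mass" matrix
V = [[3.0, -1.0], [-1.0, 2.0]]   # sample "elastic" matrix
(w1, e1), (w2, e2) = normal_modes_2x2(G, V)
print(w1, e1)
print(w2, e2)
```

For these sample matrices the characteristic equation is $2\lambda^2 - 7\lambda + 5 = 0$ with $\lambda = \omega^2$, so the squared frequencies are 1 and 5/2, and the two mode vectors are orthonormal with respect to G.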
To underline the interest of the orthogonality of the normal oscillation
modes, let us deduce from Proposition 1, and before its proof, the following
corollary.


2 Corollary. The energy of the oscillations in Eq. (4.1.5) is

$$E = \sum_{i=1}^{\ell}\omega_iA_i \tag{4.1.10}$$

i.e. it is the sum of the energies of each normal mode component.


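PROOF. The energy is [see §3.11, Observation (1), p.217]

$$E = \frac12\sum_{i,j=1}^{\ell}g_{ij}\dot x_i\dot x_j + \frac12\sum_{i,j=1}^{\ell}v_{ij}x_ix_j, \tag{4.1.11}$$

which can be written in vector form as $E = \tfrac12(G\dot x)\cdot\dot x + \tfrac12(Vx)\cdot x$ or, explicitly,
from Eq. (4.1.5):

$$\begin{aligned}
E = \frac12\sum_{i,j=1}^{\ell}2\sqrt{A_iA_j}\,\Big[&\sqrt{\omega_i\omega_j}\,\sin(\omega_it+\varphi_i)\sin(\omega_jt+\varphi_j)\,(\eta^{(i)},G\eta^{(j)})\\
&+ \frac{1}{\sqrt{\omega_i\omega_j}}\cos(\omega_it+\varphi_i)\cos(\omega_jt+\varphi_j)\,(\eta^{(i)},V\eta^{(j)})\Big]
\end{aligned} \tag{4.1.12}$$

and, using Eq. (4.1.7), we can replace $(\eta^{(i)},V\eta^{(j)})$ with $\omega_j^2(\eta^{(i)},G\eta^{(j)})$, and
by Eq. (4.1.8) plus trigonometry, one realizes that Eq. (4.1.12) becomes Eq.
(4.1.10). mbe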
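
Corollary 2 can be checked numerically in the simplest setting. The sketch below assumes diagonal matrices G = diag(m) and V = diag(k) (sample values), for which the mode vectors are $\eta^{(i)} = e_i/\sqrt{m_i}$ and $\omega_i = \sqrt{k_i/m_i}$; it evaluates the energy (4.1.11) along the motion (4.1.5) and compares it with $\sum_i\omega_iA_i$.

```python
import math

m = [1.0, 4.0]          # diagonal of G (sample values)
k = [9.0, 1.0]          # diagonal of V (sample values)
omega = [math.sqrt(ki / mi) for ki, mi in zip(k, m)]   # [3.0, 0.5]
A = [0.3, 1.2]          # actions of the two modes
phi = [0.4, 2.0]        # initial phases

def state(t):
    """Positions and velocities of Eq. (4.1.5) with eta^(i) = e_i / sqrt(m_i)."""
    x = [math.sqrt(2 * A[i] / omega[i]) / math.sqrt(m[i]) * math.cos(omega[i] * t + phi[i])
         for i in range(2)]
    v = [-math.sqrt(2 * A[i] * omega[i]) / math.sqrt(m[i]) * math.sin(omega[i] * t + phi[i])
         for i in range(2)]
    return x, v

def energy(t):
    """Kinetic plus potential energy, Eq. (4.1.11), for diagonal G and V."""
    x, v = state(t)
    return sum(0.5 * m[i] * v[i] ** 2 + 0.5 * k[i] * x[i] ** 2 for i in range(2))

expected = sum(w * a for w, a in zip(omega, A))   # Eq. (4.1.10)
energies = [energy(t) for t in (0.0, 0.7, 3.1, 12.5)]
print(expected, energies)
```

The energy is the same at every sampled time and agrees with $\omega\cdot A$ up to rounding.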


PROOF OF PROPOSITION 1. Assume the existence of $\omega_1,\dots,\omega_\ell$, the $\ell$ positive
roots of Eq. (4.1.6), and of $\ell$ linearly independent vectors $\eta^{(1)},\dots,\eta^{(\ell)}$ verifying
Eq. (4.1.7). Then by direct substitution of Eq. (4.1.5) into Eq. (4.1.3), one
sees that the function in Eq. (4.1.5) satisfies, $\forall A\in R_+^\ell$, $\forall\varphi = (\varphi_1,\dots,\varphi_\ell)\in
T^\ell$, Eq. (4.1.3).

It is also easy to see that given $(\eta,\xi)\in R^{2\ell}$ arbitrarily, it is possible to
determine $A\in R_+^\ell$, $\varphi\in T^\ell$ so that Eq. (4.1.5) verifies the datum $x(0) = \xi$,
$\dot x(0) = \eta$ for t = 0. In fact the conditions

$$\xi = \sum_{j=1}^{\ell}\sqrt{\frac{2A_j}{\omega_j}}\,\eta^{(j)}\cos\varphi_j, \qquad
\eta = -\sum_{j=1}^{\ell}\sqrt{2A_j\omega_j}\,\eta^{(j)}\sin\varphi_j, \tag{4.1.13}$$

imply, by scalar multiplication of both sides of Eq. (4.1.13) by $G\eta^{(i)}$, $i =
1,\dots,\ell$:

$$(G\eta^{(i)})\cdot\xi = \sqrt{\frac{2A_i}{\omega_i}}\cos\varphi_i, \qquad
(G\eta^{(i)})\cdot\eta = -\sqrt{2A_i\omega_i}\,\sin\varphi_i \tag{4.1.14}$$

by Eq. (4.1.8). Equation (4.1.14) determines $A_i$ and $\varphi_i$ because

$$(\sqrt{A_i},\varphi_i) = \Big\{\text{polar coordinates of the point with Cartesian coordinates }
\Big(\sqrt{\frac{\omega_i}{2}}\,G\eta^{(i)}\cdot\xi,\ -\frac{1}{\sqrt{2\omega_i}}\,G\eta^{(i)}\cdot\eta\Big)\in R^2\Big\}. \tag{4.1.15}$$

Vice versa, if $(A_i,\varphi_i)$ verifies Eq. (4.1.14), it is easy to see that, since the vectors
$\eta^{(1)},\dots,\eta^{(\ell)}$ are $\ell$ linearly independent vectors in $R^\ell$ (by assumption) and,
therefore, form a basis in $R^\ell$, Eq. (4.1.13) necessarily follows.

By virtue of the existence and uniqueness theorems of differential equations,
Eq. (4.1.5) is the most general $C^\infty$ solution to Eq. (4.1.3), $t\in R$.

It remains to show the actual existence of $\ell$ linearly independent vectors
$\eta^{(1)},\dots,\eta^{(\ell)}$ and of $\ell$ numbers $\omega_1,\dots,\omega_\ell > 0$. This is a well-known proposition
of algebra (see Appendix F, p.525). mbe


It will be useful to stress a simple corollary of Proposition 1. For this
purpose, we recall the definition of the $\ell$-dimensional torus $T^\ell$ obtained by
identifying opposite sides of the square $[0,2\pi]^\ell$, see Definitions 12 and 13,
p.100 and 101, and that of a function in $C^\infty(T^\ell)$ and set:


1 Definition. Given $\vartheta = (\vartheta_1,\dots,\vartheta_\ell)\in R^\ell$, the transformation of $T^\ell$ into
itself,

$$\varphi = (\varphi_1,\dots,\varphi_\ell)\to\varphi+\vartheta = (\varphi_1+\vartheta_1,\dots,\varphi_\ell+\vartheta_\ell), \quad\text{mod } 2\pi \tag{4.1.16}$$

will be called a “rotation of $T^\ell$ with parameters $\vartheta = (\vartheta_1,\dots,\vartheta_\ell)\in R^\ell$”. The
group $(S_t)_{t\in R}$ of transformations of $T^\ell$ into itself defined by

$$S_t\varphi = S_t(\varphi_1,\dots,\varphi_\ell) = (\varphi_1+\omega_1t,\dots,\varphi_\ell+\omega_\ell t), \quad\text{mod } 2\pi \tag{4.1.17}$$

will be called the “flow on $T^\ell$ generated by the rotation of $T^\ell$ with speed $\omega$” or
the “quasi-periodic flow on $T^\ell$ with pulsation $\omega$”.


The following is then a corollary to Proposition I. 


3 Corollary. It is possible to establish a correspondence between all the initial
data $(\eta,\xi)\in R^{2\ell}$ for Eq. (4.1.3) and the set of the points $(A,\varphi)\in R_+^\ell\times T^\ell$
via Eq. (4.1.15).

The correspondence is one to one, nonsingular, and of class $C^\infty$ between
$(0,+\infty)^\ell\times T^\ell$ and its image in $R^{2\ell}$.

In $(A,\varphi)$ coordinates, the motion of Eq. (4.1.5) is simply

$$t\to(A,\varphi+\omega t), \tag{4.1.18}$$

i.e., it is a quasi-periodic flow on the torus $\{A\}\times T^\ell$.


Observations. 

(1) Corollary 3 and Eq. (4.1.18) say that the motion of $\ell$ harmonic oscillators
“consists of quasi-periodic motions taking place on a family of $\ell$-dimensional
tori” parameterized by $\ell$ parameters A. If one discards the data for which
some of the normal modes are at rest (i.e., those for which some of the A’s
vanish), one can also say that the initial data space can be thought of as
“foliated” by an $\ell$-dimensional family of $\ell$-dimensional tori.

(2) The parameter $A_i$ is called the “action of the i-th normal mode”. If one
describes the system in $(A,\varphi)$ coordinates in the region where $A\in(0,+\infty)^\ell$,
it is clear that it can be regarded as a Hamiltonian system on $(0,+\infty)^\ell\times T^\ell$
with Hamiltonian

$$h(A,\varphi) = \sum_{i=1}^{\ell}\omega_iA_i = \omega\cdot A \tag{4.1.19}$$

which leads immediately to Eq. (4.1.18).
(3) Observation (2) leads us to think that if the original system with Lagrangian
(4.1.2) is described in the Hamiltonian form by the Hamiltonian

$$H(p,x) = \frac12G^{-1}p\cdot p + \frac12Vx\cdot x \tag{4.1.20}$$

[see Eq. (3.11.25)], the map $(A,\varphi)\to(p,x)$ between $(0,+\infty)^\ell\times T^\ell$ and the
part of phase space where all the normal modes are excited (i.e. $A_i > 0$, $\forall i$) is
a completely canonical transformation: this is in fact true and it is the reason
for writing Eq. (4.1.5) with $\sqrt{2A_i/\omega_i}$ instead of the simpler $A_i$ (see exercises).
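
The step from the Hamiltonian (4.1.19) to the motion (4.1.18) can be written out explicitly. The sketch below treats A as the momentum-like variable and $\varphi$ as the coordinate, with the sign conventions of Eq. (3.12.6):

```latex
% Hamilton's equations for h(A,\varphi) = \omega \cdot A
\dot A_i = -\frac{\partial h}{\partial\varphi_i} = 0, \qquad
\dot\varphi_i = \frac{\partial h}{\partial A_i} = \omega_i
\quad\Longrightarrow\quad
A(t) = A(0), \qquad \varphi(t) = \varphi(0) + \omega t \ \ (\mathrm{mod}\ 2\pi).
```

The actions are constants of motion and the angles rotate uniformly, which is the quasi-periodic flow on the torus $\{A\}\times T^\ell$.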


4.1.1 Exercises 


1. Using Problems (1), (2), and (33), §3.11, show that the maps (p, q)— (T, K) with t = 
2,2 

Som k = q/mw, and (r, k= (E, 77) de (A, p) with y = {polar angular coordinate 

of (k, 7) E R?} are completely canonical maps. Show that performing such transformations 

2 335 

successively, one builds a completely canonical transformation changing H = $ + “4 


into H=wA. 


2. Let $H(p,q) = \tfrac12G^{-1}p\cdot p + \tfrac12Vq\cdot q$ with G, V being two positive-definite $\ell\times\ell$ matrices.
By Problem (33) of §3.11, the map $(p,q)\to(\pi,\kappa)$ defined by $p = \sqrt{G}\,\pi$, $q = \sqrt{G}^{\,-1}\kappa$ (see
Appendix F, p.525, for the definition and the existence of the positive matrix $\sqrt{G}$ such that
$(\sqrt{G})^2 = G$) is completely canonical. Show that it transforms H into $\tfrac12\pi\cdot\pi + \tfrac12\tilde V\kappa\cdot\kappa$ with
$\tilde V = \sqrt{G}^{\,-1}V\sqrt{G}^{\,-1}$.

Let R be an orthogonal matrix (see Appendix E), transforming $\tilde V$ into a diagonal matrix
$\Omega$ with diagonal elements $\omega_1^2,\dots,\omega_\ell^2$, i.e., $R^T\tilde VR = \Omega$ (see Appendix F for an existence
theorem on R). Show that the further completely canonical change of coordinates $\pi =
R\tilde\pi$, $\kappa = (R^T)^{-1}\tilde\kappa$ changes H into

$$\frac12\tilde\pi\cdot\tilde\pi + \frac12\Omega\tilde\kappa\cdot\tilde\kappa = \frac12\sum_{i=1}^{\ell}\big(\tilde\pi_i^2 + \omega_i^2\tilde\kappa_i^2\big).$$

Then, for each i, by further applying the maps in Exercise 1, the Hamiltonian is changed
into $\sum_{i=1}^{\ell}\omega_iA_i$: prove this as well.


3. Check that the variables $(A,\varphi)$ constructed in Exercise 2 are the same as those appearing
in Proposition 1.


4.2 Irrational Rotations on ℓ-Dimensional Tori


In §4.1 a natural description of the motion of a system of harmonic oscillators
was given as a quasi-periodic flow on $T^\ell$ of the form

$$S_t\varphi = (\varphi+\omega t) = (\varphi_1+\omega_1t,\dots,\varphi_\ell+\omega_\ell t) \tag{4.2.1}$$

Hence it is convenient to analyze a few properties of the quasi-periodic flows.

2 Definition. The flow of Eq. (4.2.1) is “irrational” if (w1,,...,we) E€ RE 
are “rationally independent” numbers, i.e., if the relation 


£ 
new = 5 niwi = 0, ni,..., ne integers, (4.2.2) 
i=1 


implies ny =... ng = 0. 
From the definition it follows: 
4 Proposition. Let (Stjer be a quasi periodic flow defined on T* by Eq. 
(4.2.1) with w E€ RE. If p E T! and to E R, the trajectory 
N(to) = {p| p" = Sip, for some t > to} (4.2.3) 
is dense on T° if and only if the flow is irrational. 


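Observation. It would be possible to provide a direct proof of Proposition 4
along the lines of the analogous Proposition 27, p.92, §2.20, in the case $\ell = 2$.
However, we prefer to give an alternative proof based on the Fourier series
and on the following proposition which is interesting in itself.

5 Proposition. Let $f \in C^\infty(\mathbb{T}^\ell)$ and let $(S_t)_{t \in \mathbb{R}_+}$ be a flow of the type of
Eq. (4.2.1) on $\mathbb{T}^\ell$ which is irrational. Then, $\forall \varphi \in \mathbb{T}^\ell$, the average value

$$\bar f(\varphi) = \lim_{T \to \infty} \frac{1}{T} \int_0^T f(S_t \varphi)\, dt \qquad (4.2.4)$$

exists and is $\varphi$-independent and equal to

$$\bar f = \int_0^{2\pi} \!\!\cdots\! \int_0^{2\pi} f(\varphi_1, \ldots, \varphi_\ell)\, \frac{d\varphi_1 \ldots d\varphi_\ell}{(2\pi)^\ell} \equiv \int_{\mathbb{T}^\ell} f(\varphi)\, \frac{d\varphi}{(2\pi)^\ell} \qquad (4.2.5)$$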




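PROOF. Since $f \in C^\infty(\mathbb{T}^\ell)$, it may be represented as

$$f(\varphi) = \sum_{n_1, \ldots, n_\ell} \hat f_{n_1 \ldots n_\ell}\, e^{i(n_1 \varphi_1 + \ldots + n_\ell \varphi_\ell)} \equiv \sum_{n \in \mathbb{Z}^\ell} \hat f_n\, e^{i n \cdot \varphi}, \qquad (4.2.6)$$

where $(\hat f_n)_{n \in \mathbb{Z}^\ell}$ are the Fourier harmonics of $f$ (see Proposition 28, p.103),
and they decrease faster than any power of $|n|$ as $|n| \to \infty$.

Furthermore, the right-hand side of Eq. (4.2.5) is just $\hat f_0$ [see Eq. (2.21.13)].
Then

$$\frac{1}{T} \int_0^T f(S_t \varphi)\, dt = \sum_{n \in \mathbb{Z}^\ell} \hat f_n\, e^{i n \cdot \varphi} \left\{ \frac{1}{T} \int_0^T e^{i n \cdot \omega t}\, dt \right\} \qquad (4.2.7)$$

and the series in Eq. (4.2.7) is bounded above by the convergent series

$$\sum_{n \in \mathbb{Z}^\ell} |\hat f_n| < +\infty \qquad (4.2.8)$$

because the number in curly brackets in Eq. (4.2.7) clearly has a modulus not
exceeding 1, being an average of numbers of modulus 1. Then we can take the
limit in Eq. (4.2.7), as $T \to +\infty$, term by term.

But the integral in the right-hand side of Eq. (4.2.7) is

$$\frac{1}{T}\, \frac{e^{i n \cdot \omega T} - 1}{i\, n \cdot \omega} \xrightarrow[T \to +\infty]{} 0 \qquad \text{if } n \cdot \omega \ne 0 \qquad (4.2.9)$$

while it is 1 if $n \cdot \omega = 0$. However, $n \cdot \omega = 0$ only for $n = 0$ and all the terms
in Eq. (4.2.7) vanish except that with $n = 0$ as $T \to +\infty$, and Eq. (4.2.5) is
proved. mbe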
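
The decay expressed by Eq. (4.2.9) can be checked numerically. The following Python sketch evaluates the quantity in curly brackets of Eq. (4.2.7) in closed form; the function name `harmonic_time_average` and the sample frequencies $(1, \sqrt{2})$ are illustrative assumptions, not taken from the text.

```python
import cmath
import math

def harmonic_time_average(n, omega, T):
    """Closed form of (1/T) * integral_0^T exp(i n.omega t) dt, as in Eq. (4.2.9)."""
    s = sum(ni * wi for ni, wi in zip(n, omega))  # the scalar product n.omega
    if s == 0:
        return 1.0 + 0.0j  # resonant harmonic: the average is exactly 1
    return (cmath.exp(1j * s * T) - 1) / (1j * s * T)

# Hypothetical example: omega = (1, sqrt(2)) is rationally independent,
# so every harmonic with n != 0 averages out as T grows.
omega = (1.0, math.sqrt(2.0))
for T in (10.0, 1e3, 1e5):
    print(T, abs(harmonic_time_average((3, -2), omega, T)))
```

The modulus is bounded by $2/(|n \cdot \omega|\, T)$, so the decay is slow for harmonics with small $|n \cdot \omega|$ even though every such term eventually vanishes.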


Note that Proposition 5 is also an immediate consequence of Proposition 
30, p.105. The same method of proof of Proposition 5 could be used to prove 
the following proposition which we describe before proving Proposition 4. 


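6 Proposition. With the same hypothesis as that of Proposition 5, let $\tau \in \mathbb{R}$,
$\tau \ne 0$, and consider the limit

$$\lim_{N \to \infty} \frac{1}{N} \sum_{j=0}^{N-1} f(S_{j\tau} \varphi). \qquad (4.2.10)$$

Such a limit exists and is given by Eq. (4.2.5) if the $(\ell + 1)$ numbers
$\omega = \frac{2\pi}{\tau}, \omega_1, \ldots, \omega_\ell$ are rationally independent.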

Observations. 

(1) Proposition 6 is the generalization to the $\ell > 1$ case of the Observations
(5) and (6), p.111. The proof is left to the reader as an exercise on the proof
of Proposition 5.

(2) A simple analysis of the proof of the Propositions 5 and 6 allows us to
conclude that the limits of Eqs. (4.2.4) and (4.2.10) exist in general, but they
will not generally be $\varphi$ independent unless $\omega_1, \ldots, \omega_\ell$ (or $\frac{2\pi}{\tau}, \omega_1, \ldots, \omega_\ell$) are
rationally independent.
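
A minimal numerical sketch of the discrete-time average of Eq. (4.2.10) in the simplest case $\ell = 1$ may help here. It assumes $f = \cos$, $\tau = 1$ and $\omega_1 = \sqrt{2}$ (so that $2\pi/\tau$ and $\omega_1$ are rationally independent); the names `discrete_average` and `geometric_bound` are illustrative. Since the space average of $\cos$ is zero, the sums must tend to zero, and summing the geometric series gives the explicit bound $1/(N|\sin(\tau\omega_1/2)|)$.

```python
import math

def discrete_average(f, phi, omega1, tau, N):
    """(1/N) * sum_{j=0}^{N-1} f(phi + j*tau*omega1): Eq. (4.2.10) on the circle."""
    return sum(f(phi + j * tau * omega1) for j in range(N)) / N

def geometric_bound(omega1, tau, N):
    """Bound on |average of cos| obtained by summing the geometric series."""
    return 1.0 / (N * abs(math.sin(tau * omega1 / 2.0)))

tau = 1.0
omega1 = math.sqrt(2.0)  # assumed sample frequency
for N in (10, 1000, 100000):
    avg = discrete_average(math.cos, 0.4, omega1, tau, N)
    print(N, avg, geometric_bound(omega1, tau, N))
```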


An immediate corollary to Proposition 5 is the following proof of Propo- 
sition 4. 


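PROOF OF PROPOSITION 4. Assume that $S_t$ is an irrational flow. Let $\varphi_0 \in \mathbb{T}^\ell$
and let $\chi \in C^\infty(\mathbb{T}^\ell)$ be a non-negative function having the value 1 in $\varphi_0$, and
zero outside a small ball $\sigma_\varepsilon \subset \mathbb{T}^\ell$ with center $\varphi_0$ and radius $\varepsilon$ in the metric
of $\mathbb{T}^\ell$ [see Eq. (2.21.5), p.101].

Apply Proposition 5 to $\chi$. We see that the average value of $t \to \chi(S_t \varphi)$
cannot approach zero, $\forall \varphi \in \mathbb{T}^\ell$: by Eq. (4.2.5) it equals the integral of $\chi$ over
$\mathbb{T}^\ell$ divided by $(2\pi)^\ell$, which is positive. Hence, for every $t_0$, there must be $t > t_0$
such that $\chi(S_t \varphi) > 0$, i.e., $S_t \varphi$ is closer to $\varphi_0$ than $\varepsilon$. This means that $\Omega(t_0)$
is dense. Vice versa, if there exist integers $\bar n_1, \ldots, \bar n_\ell$ not all equal to zero such
that $\bar n \cdot \omega = 0$, the function on $\mathbb{T}^\ell$ defined by

$$\varphi \to \cos(\bar n \cdot \varphi) \qquad (4.2.11)$$

is not constant on $\mathbb{T}^\ell$ but is constant on the trajectory $t \to S_t(\varphi)$, $t \in \mathbb{R}_+$,
for all $\varphi \in \mathbb{T}^\ell$ (since $\bar n \cdot \omega = 0$). Therefore, for instance, the trajectory
of the origin cannot approach too closely any point $\psi$ such that $\cos(\bar n \cdot \psi) < 1$
and vice versa. So $\Omega(t_0)$ is not dense. mbe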
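
The second half of the proof can be illustrated numerically. The sketch below assumes the resonant frequencies $\omega = (1, 2)$ with $\bar n = (2, -1)$, so that $\bar n \cdot \omega = 0$; the helper names `flow` and `invariant` and the starting point are illustrative choices. It checks that the function of Eq. (4.2.11) keeps the same value along a sampled trajectory.

```python
import math

def flow(phi, omega, t):
    """S_t phi = phi + omega*t, with each angle reduced mod 2*pi."""
    return tuple((p + w * t) % (2.0 * math.pi) for p, w in zip(phi, omega))

def invariant(phi, n):
    """The function phi -> cos(n . phi) of Eq. (4.2.11)."""
    return math.cos(sum(ni * pi for ni, pi in zip(n, phi)))

omega = (1.0, 2.0)   # assumed resonant frequencies: 2*omega_1 - omega_2 = 0
n_bar = (2, -1)
phi0 = (0.3, 1.1)    # arbitrary starting point

values = [invariant(flow(phi0, omega, 0.37 * k), n_bar) for k in range(200)]
spread = max(values) - min(values)
print(spread)  # zero up to rounding: the trajectory stays on a level set
```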


In the same way in which Proposition 5 implies Proposition 4, one sees 
that Proposition 6 implies the following corollary. 


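7 Corollary. With the same hypothesis as that of Proposition 4, let $\tau > 0$.
The denumerable subset of $\mathbb{T}^\ell$,

$$\Omega_\tau(t_0) = \{\varphi' \mid \exists\, h \text{ integer}, \; h\tau > t_0, \; \varphi' = S_{h\tau} \varphi\} \qquad (4.2.12)$$

is dense in $\mathbb{T}^\ell$ if and only if the $\ell + 1$ numbers $\omega, \omega_1, \ldots, \omega_\ell$, $\omega = 2\pi\tau^{-1}$, are
rationally independent.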


PROOF. Exercise. 


4.3 Ordered Systems of Oscillators. Phenomenological 
Discussion and Heuristic Formulation of the Model of 
the Perfect Elastic Body (String, Film, and Solid) 


In applications, serious difficulties may be met in the use of the general theory
of §4.1 and §4.2. Such use, in fact, presupposes the actual possibility of
constructing the proper pulsations $\omega_1, \ldots, \omega_\ell$ and the respective eigenvectors
$\eta^{(1)}, \ldots, \eta^{(\ell)}$: their construction, in fact, passes through the solution of an $\ell$-th
degree algebraic equation, Eq. (4.1.6), and of $\ell$ linear systems of $\ell$ equations,
Eq. (4.1.7).

However, it is also true that in important applications, the matrices G and 
V of §4.1 are not arbitrary, but rather they have special properties sometimes 
permitting the explicit solution of the normal modes construction. 

In §4.3-4.6, some of the most interesting cases will be examined, while this 
section is devoted to the precise mathematical formulation of the models that 
will be considered. 

Let $\mathbb{Z}_a^d$ be the $d$-dimensional lattice of the points $\xi \in \mathbb{R}^d$ with coordinates
which are integer multiples of $a > 0$:

$$\xi = (n_1 a, n_2 a, \ldots, n_d a), \qquad n_1, \ldots, n_d \text{ integers} \qquad (4.3.1)$$

Imagine that around every site $\xi \in \mathbb{Z}_a^d$, a mass $m$ oscillates bound by ideal
constraints to move on a straight line through $\xi$ and orthogonal to $\mathbb{R}^d$.

Furthermore, suppose that if $y_\xi$ is the elongation with respect to $\xi$ of the
oscillator in $\xi$ then:


Figure 4.1: chain of oscillators elastically bound by nearest neighbors and to centers aligned
on an axis orthogonal to the vibrations.


(i) Every oscillator is subject to a restoring elastic force with potential energy

$$\frac{K}{2}\, y_\xi^2 \qquad (4.3.2)$$

(ii) Every oscillator is subject to an external force with potential energy

$$m\, g(\xi)\, y_\xi, \qquad (4.3.3)$$

where $g \in C^\infty(\mathbb{R}^d)$ (“weight”).

(iii) Between the oscillators adjacent in $\mathbb{Z}_a^d$, an elastic force acts whose potential
energy is

$$\frac{1}{2} K' \left[ (y_\xi - y_{\xi'})^2 + a^2 \right], \qquad (4.3.4)$$

where $|\xi' - \xi| = a$ and the term in square brackets represents the square of
the elongation of a spring between the two oscillators.

(iv) An ideal constraint forces all the oscillators outside an open connected
bounded region $\Omega$, with boundary $\partial\Omega$ which is a $C^\infty$-regular surface, to have
zero elongation. Set $\Omega_a = \Omega \cap \mathbb{Z}_a^d$.


Only the cases $d = 1$ and $d = 2$ will be considered, the $d = 3$ case
being a not too interesting model of an elastic solid since it can only “vibrate
in one direction”. The situation in the $d = 1$ case is pictured in Fig. 4.1 while
the $d = 2$ case is pictured in Fig. 4.2.


Figure 4.2: System of oscillators elastically bound to their nearest neighbors and to a lattice
of centers on a plane orthogonal to the vibrations.


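Analytically, the system is described by the Lagrangian function:

$$\mathcal{L}_0 = \frac{1}{2} \sum_{\xi \in \Omega_a} m\, \dot y_\xi^2 - \frac{K}{2} \sum_{\xi \in \Omega_a} y_\xi^2 - m \sum_{\xi \in \Omega_a} g(\xi)\, y_\xi - \frac{K'}{2} \sum_{\xi \in \Omega_a} \sum_{e} \frac{1}{\nu(e, \xi)}\, (y_\xi - y_{\xi + ae})^2, \qquad (4.3.5)$$

where $\sum_e$ denotes the sum over the $2d$ unit vectors directed as the axes of
$\mathbb{Z}_a^d$: $e = e_1, -e_1, e_2, -e_2, \ldots, e_d, -e_d$, where $e_1, \ldots, e_d$ are the $d$ unit vectors
associated with $\mathbb{Z}_a^d$, and, to avoid double counting, $\nu(e, \xi) = 2$ if $\xi, \xi + ae \in \Omega_a$,
$\nu(e, \xi) = 1$ otherwise.

In the last sum in the right-hand side of Eq. (4.3.5), the term $a^2$ appearing
in Eq. (4.3.4) has been dropped since it produces an additive constant to $\mathcal{L}_0$
(dynamically irrelevant).

In Eq. (4.3.5) there appear terms $y_\xi$ with $\xi \notin \Omega_a$ (in fact, if $\xi$ is close to
$\partial\Omega$ it can happen that $\xi + ae \notin \Omega_a$). Such terms, conforming to (iv), must be
interpreted by setting $y_\xi = 0$.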

From a physical viewpoint, the interest of the mechanical system in Eq. 
(4.3.5) lies in the fact, suggested by the above pictures, that if a is very small, 
it can be considered as a discrete model for an elastic string or film (if d = 1 
or d= 2). 

We can imagine that for small $a$, every “regular” initial datum $(y_\xi, \dot y_\xi)_{\xi \in \Omega_a}$,
i.e., every datum having the form

$$y_\xi = u(\xi), \qquad \dot y_\xi = v(\xi), \qquad \xi \in \Omega_a \qquad (4.3.6)$$

where $u, v$ are functions in $C^\infty(\mathbb{R}^d)$ vanishing outside $\Omega$, a space that will
be denoted $C_0^\infty(\Omega)$, evolves remaining approximately regular, thus simulating
the motion of a string or film. In order for this to occur, it is, however, clear
that the parameters $m$, $K$, $K'$ must be suitably chosen as functions of $a$: their
choice, which we adopt in the following, is motivated by a heuristic discussion.


(a) The mass $m$ of each oscillator must have the form

$$m = \mu\, a^d, \qquad \mu > 0, \qquad (4.3.7)$$

since each oscillator should intuitively correspond to a small piece of the body
with dimension $a$: the body will then have density $\mu$.

(b) The constants $K$, $K'$ have to be determined so as to produce forces
proportional to $a^d$ on the oscillator in $\xi$; otherwise their effects would vanish in
the $a \to 0$ limit (if $\ll a^d$) or they would produce infinite accelerations (if
$\gg a^d$). Hence, since the force associated with $K$ is $-K y_\xi$, it must be:

$$K = \sigma\, a^d, \qquad \sigma > 0 \qquad (4.3.8)$$

The force exerted by the two oscillators in $\xi - a e_i$ and $\xi + a e_i$ on the oscillator
in $\xi$ is

$$-K' \left[ (y_\xi - y_{\xi + a e_i}) + (y_\xi - y_{\xi - a e_i}) \right], \qquad (4.3.9)$$

and if $y_\xi$ can be assimilated to $u(\xi)$, $u \in C_0^\infty(\Omega)$, we can compute Eq. (4.3.9)
using the Taylor-Lagrange expansion to second order as

$$y_\xi - y_{\xi \pm a e_i} = u(\xi) - u(\xi \pm a e_i) = \mp a\, \partial_i u(\xi) - \frac{a^2}{2}\, \partial_i^2 u(\xi) + O(a^3), \qquad (4.3.10)$$

where $\partial_i u$, $\partial_i^2 u$ are short notations for $\frac{\partial u}{\partial \xi_i}$, $\frac{\partial^2 u}{\partial \xi_i^2}$. Then Eq. (4.3.9) becomes

$$K' a^2\, \partial_i^2 u(\xi) + O(a^3) \qquad (4.3.11)$$

which indicates that it must be set

$$K' a^2 = \tau\, a^d, \qquad \tau > 0. \qquad (4.3.12)$$
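
The step from Eq. (4.3.9) to Eq. (4.3.11) rests on the fact that the symmetric second difference approximates the second derivative. A small Python sketch of this, assuming the test function $u = \sin$ in one dimension (the function name `second_difference` is an illustrative choice):

```python
import math

def second_difference(u, x, a):
    """(u(x+a) - 2u(x) + u(x-a)) / a^2, the lattice analogue of u''(x)."""
    return (u(x + a) - 2.0 * u(x) + u(x - a)) / a**2

x = 0.7
exact = -math.sin(x)  # second derivative of sin
for a in (1e-1, 1e-2, 1e-3):
    error = abs(second_difference(math.sin, x, a) - exact)
    print(a, error)   # the error shrinks roughly like a^2
```

Because the odd-order Taylor terms cancel in the symmetric combination, the error decreases faster than a one-sided difference would.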
With the above choices of $K$, $m$, $K'$, Eq. (4.3.5) becomes

$$\mathcal{L}_0^{(a)} = \frac{\mu}{2}\, a^d \sum_{\xi \in \Omega_a} \dot y_\xi^2 - \frac{\sigma}{2}\, a^d \sum_{\xi \in \Omega_a} y_\xi^2 - \mu\, a^d \sum_{\xi \in \Omega_a} g(\xi)\, y_\xi - \frac{\tau}{2}\, a^d \sum_{\xi \in \Omega_a} \sum_{e} \frac{1}{\nu(e, \xi)}\, \frac{(y_\xi - y_{\xi + ae})^2}{a^2} \qquad (4.3.13)$$

This model is not yet completely correct from a physical point of view. The
heuristic discussion so far presented has been dealt with by supposing that $\xi$
was far from $\partial\Omega$: if $\xi$ is adjacent to $\partial\Omega$, it is not quite clear what is meant
by $y_\xi$ being regular since the functions $u, v$ approximating it in Eq. (4.3.6)
cannot be $a$ independent, as supposed. A look at Fig. 4.3 suffices to realize
this. The points outside $\Omega$ and adjacent to it have a rather erratic structure
and, quite delicately, are $a$ dependent.


Figure 4.3: The erratic mismatches between the regular lattice and the boundary of $\Omega$.


Though this point may superficially appear irrelevant, it in fact has some
importance at least as far as the correct formulation of the meaning of “regular
datum $y_\xi, \dot y_\xi$” is concerned.

In the $d = 1$ case, the difficulty can be simply avoided by supposing that
$a$ is always chosen so that $\partial\Omega$ (which now consists of two points) is on
$\mathbb{Z}_a^1$: in this case, therefore, we shall actually do so and we shall assume that
the system (4.3.13), with the above restriction on the “allowed values” of $a$,
is a “vibrating” or “elastic string” model.

In the $d = 2$ case, it is obviously not possible to circumvent the difficulty
so easily and, to understand what to do, let us again refer to some heuristic
physical considerations.

When one imagines an elastic homogeneous film oscillating with a fixed
boundary $\partial\Omega$, one probably has in mind the following situation: one deposits
an elastic homogeneous film on a plane and then “glues” the film on the plane
at $\partial\Omega$ and, afterwards, lets it oscillate and studies (or watches) the oscillations.

When the surface is described, as in our case, by linked oscillators, the
corresponding procedure is that of setting the oscillators in their equilibrium
positions on $\mathbb{Z}_a^d$ and then pinching (with “glue” or “nails”) the springs
connecting the points $\xi \in \Omega$ to the points $\xi' = \xi + ae \notin \Omega$ at the point $\xi + \varepsilon e$
where the segment $\xi\xi'$ crosses $\partial\Omega$. Once this is done, the system is allowed
to oscillate.


Figure 4.4: The pinching to adapt to the boundary condition.

On the boundary of $\Omega$, the situation drawn in Fig. 4.4 is produced. This
means that the elastic constant binding $y_\xi$ to $\partial\Omega$ is different from $K'$ contrary
to what, instead, is hypothesized in $\mathcal{L}_0^{(a)}$, Eq. (4.3.13). In fact, $y_\xi$ is pulled
from $\partial\Omega$ by a spring with elastic constant

$$\bar K = K'\, \frac{a}{\varepsilon} \qquad (4.3.14)$$

because the elastic constant of a piece of spring with elastic constant $K'$
obtained by pinching it at a distance $\varepsilon$ when the spring is elongated by $a$ is
given by Eq. (4.3.14).


Figure 4.5: Illustration of the system of oscillators corresponding to Eq. (4.3.16). 


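Then, for $\xi \in \Omega_a$, we set

$$\varepsilon_a(\xi, e) = \begin{cases} a & \text{if } \xi + ae \in \Omega_a, \\ \text{distance between } \xi \text{ and the point where the segment from } \xi \text{ to } \xi + ae \text{ crosses } \partial\Omega & \text{otherwise,} \end{cases} \qquad (4.3.15)$$

and the above considerations are summarized in the following Lagrangian
function which will be supposed to be our discrete model of the elastic string
or film (see Fig. 4.5), discarding the simpler but more naive model of Eq.
(4.3.13):

$$\mathcal{L}^{(a)} = \frac{\mu}{2}\, a^d \sum_{\xi \in \Omega_a} \dot y_\xi^2 - \frac{\sigma}{2}\, a^d \sum_{\xi \in \Omega_a} y_\xi^2 - \mu\, a^d \sum_{\xi \in \Omega_a} g(\xi)\, y_\xi - \frac{\tau}{2}\, a^d \sum_{\xi \in \Omega_a} \sum_{e} \frac{1}{\nu(e, \xi)}\, \frac{a}{\varepsilon_a(\xi, e)}\, \frac{\left(y_\xi - y_{\xi + \varepsilon_a(\xi, e) e}\right)^2}{a^2}. \qquad (4.3.16)$$

Here the values of $y_{\xi'}$ when $\xi' \notin \Omega$, present in Eq. (4.3.16) if $\xi$ is close to
$\partial\Omega$ and $\xi + \varepsilon_a(\xi, e)\, e = \xi' \in \partial\Omega$, have to be thought of as vanishing. Or, more
generally, we may fix the film or string at preassigned elongations on $\partial\Omega$,
described by a function $h \in C^\infty(\partial\Omega)$.¹

In this case, the values of $y_{\xi'}$, for the above $\xi'$'s are to be thought of as
given by

$$y_{\xi'} = h(\xi') \qquad (4.3.17)$$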


It is clear that Eq. (4.3.16) differs from Eq. (4.3.13) only because of the
terms for which $\xi$ is adjacent to $\partial\Omega$.

It is also clear that the critique to Eq. (4.3.13) raised above can no longer
be applied. For instance, if $h = 0$, the initial datum $y_\xi, \dot y_\xi$, very naturally, can
be called “regular” if, $\forall \xi \in \Omega_a$,

$$y_\xi = u(\xi), \qquad \dot y_\xi = v(\xi) \qquad (4.3.18)$$

and $u, v \in C_0^\infty(\Omega)$ = {set of the $C^\infty$ functions defined in a neighborhood of
$\Omega$ and vanishing outside $\Omega$}.

In the upcoming sections, we shall study some properties of the motions
of the system in Eqs. (4.3.16) and (4.3.17), paying attention to the problem
of regularity for the motions with initial conditions Eq. (4.3.18) and to their
interpretability as motions of a string or film.

If $d = 3$, Eq. (4.3.16) still makes sense, but it no longer provides a natural
model of an elastic solid. However, it becomes much more natural if $y_\xi$, instead
of being a scalar quantity ($y_\xi \in \mathbb{R}$), is thought of as a vector in $\mathbb{R}^3$. In this case,
by thinking $y_\xi \in \mathbb{R}^3$, instead of $y_\xi \in \mathbb{R}$ (as done so far), Eq. (4.3.16) would
yield an interesting (though rather special) model for the elastic deformations
of a solid. However, the case $d = 3$ will not be further examined.


4.4 Oscillator Chains and the Vibrating String 


Consider the Lagrangian function of Eqs. (4.3.16) and (4.3.17), supposing
$\Omega = [0, L]$ and $a$ such that $L/a = N$ is an integer.

Therefore, this function describes a system of $N + 1$ oscillators, the first
and the last of which are fixed at given heights. The Lagrangian of Eqs.
(4.3.16) and (4.3.17) becomes

$$\mathcal{L} = \sum_{i=1}^{N-1} \left( \frac{\mu a}{2}\, \dot y_{ia}^2 + \mu a\, g(ia)\, y_{ia} - \frac{\sigma a}{2}\, y_{ia}^2 \right) - \frac{\tau a}{2} \sum_{i=0}^{N-1} \frac{(y_{ia} - y_{ia+a})^2}{a^2} \qquad (4.4.1)$$

$$y_0 = h_0, \qquad y_L = h_L, \qquad g \in C^\infty(\mathbb{R}). \qquad (4.4.2)$$


1 A function $f$ defined on a regular surface $\Sigma \subset \mathbb{R}^d$ is said to be in $C^\infty(\Sigma)$ if in any local
system $(U, \Xi)$ of regular coordinates, its restriction to $\Sigma \cap U$ is a $C^\infty$ function of the
coordinates of the points of $U \cap \Sigma$ in $(U, \Xi)$ (see Definition 10, §3.6, p.170).

The equations of motion for Eqs. (4.4.1) and (4.4.2) become

$$\mu a\, \ddot y_{ia} = \mu a\, g(ia) - \sigma a\, y_{ia} - \frac{\tau}{a} \left( 2 y_{ia} - y_{ia+a} - y_{ia-a} \right) \qquad (4.4.3)$$

for $i = 1, \ldots, N - 1$, where $y_0 = h_0$, $y_L = h_L$, $\mu > 0$, $\sigma > 0$, $\tau > 0$.

This is a system of linear non homogeneous differential equations which, as
usual, we shall study by writing its solutions as sums of a particular solution
and of a solution of the homogeneous equation, which is obtained by setting
$g = 0$ and $h_0 = h_L = 0$.

Let us first study the homogeneous equation. The results of the following
analysis are summarized by Proposition 9 at the end of this section.

In the homogeneous case, Eq. (4.4.3) corresponds to the Lagrangian of
Eqs. (4.4.1) and (4.4.2) with $g = 0$, $h_0 = h_L = 0$. This is a system of
oscillators of the type considered in §4.1 with, $i, j = 1, \ldots, N - 1$,

$$G_{ij} = \mu a\, \delta_{ij}, \qquad (4.4.4)$$

$$V_{ij} = \sigma a\, \delta_{ij} + \frac{\tau}{a} \left( 2\delta_{ij} - \delta_{i,j+1} - \delta_{i,j-1} \right). \qquad (4.4.5)$$

This can be checked immediately by noting that if $y = (y_i)_{i=1,\ldots,N-1}$, one
finds (setting $y_0 = y_N = 0$) that Eq. (4.4.5) yields

$$\sum_{i,j=1}^{N-1} V_{ij}\, y_i y_j = a\sigma \sum_{i=1}^{N-1} y_i^2 + a\tau \sum_{j=0}^{N-1} \frac{(y_{j+1} - y_j)^2}{a^2} \qquad (4.4.6)$$


To solve the system $\omega^2 G \eta - V \eta = 0$ [see Eqs. (4.1.6) and (4.1.7)] remark that
such a system has the explicit form

$$-\mu \omega^2 \eta_{ja} = -\sigma \eta_{ja} - \tau\, \frac{2\eta_{ja} - \eta_{ja+a} - \eta_{ja-a}}{a^2} \qquad (4.4.7)$$

where $j = 1, \ldots, N - 1$ and $\eta_0 = \eta_L = 0$.

The manifest analogy between this equation and the linear differential
equation $-\mu\omega^2 \eta = -\sigma \eta + \tau \eta''$ suggests looking for solutions of Eq. (4.4.7)
having the form

$$\eta_{ja} = \sum_{\varrho} \beta_\varrho\, e^{\alpha_\varrho ja}, \qquad \beta_\varrho, \alpha_\varrho \in \mathbb{C}, \qquad (4.4.8)$$

where $\varrho$ is a summation index.

In order that $e^{\alpha_\varrho ja}$ is a solution of Eq. (4.4.7) for $j = 2, \ldots, N - 2$, it must
be [by substitution of Eq. (4.4.8) into Eq. (4.4.7), $j = 2, \ldots, N - 2$]:

$$-\mu \omega^2 = -\sigma - \tau\, \frac{2 - e^{\alpha_\varrho a} - e^{-\alpha_\varrho a}}{a^2} \qquad (4.4.9)$$

If $\omega$ is such that this equation for $\alpha_\varrho$ has a solution $\alpha$, then $-\alpha$ is also a
solution. Hence it seems natural to try to solve Eq. (4.4.7) with $\eta$ given by

$$\eta_{ja} = \beta_+ e^{\alpha ja} + \beta_- e^{-\alpha ja}, \qquad j = 1, \ldots, N - 1, \qquad (4.4.10)$$


where a and w are related by Eq. (4.4.9). 

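The only equations of the system of Eq. (4.4.7) that Eq. (4.4.10) still may
fail to verify are the first and last. If $\eta$ has the form of Eq. (4.4.10), such
equations become equations for $\beta_\pm$:

$$\sum_{\varrho = \pm} \left( (-\mu\omega^2 + \sigma) + \frac{\tau}{a^2} \left( 2 - e^{-\varrho \alpha a} \right) \right) \beta_\varrho\, e^{\varrho \alpha (N-1) a} = 0 \qquad (4.4.11)$$

corresponding to Eq. (4.4.7) with $j = N - 1$ or, for $j = 1$,

$$\sum_{\varrho = \pm} \left( (-\mu\omega^2 + \sigma) + \frac{\tau}{a^2} \left( 2 - e^{\varrho \alpha a} \right) \right) \beta_\varrho\, e^{\varrho \alpha a} = 0 \qquad (4.4.12)$$

which, by using Eq. (4.4.9), become, respectively,

$$\frac{\tau}{a^2} \sum_{\varrho = \pm} \beta_\varrho\, e^{\varrho \alpha N a} = 0, \qquad \frac{\tau}{a^2} \sum_{\varrho = \pm} \beta_\varrho = 0. \qquad (4.4.13)$$

The latter two homogeneous equations, in the two unknowns $\beta_+$ and $\beta_-$,
have a nontrivial solution if the determinant of the coefficients vanishes, i.e.,
it must be

$$e^{2\alpha N a} = 1 \qquad (4.4.14)$$

and, in this case, $\beta_+ = -\beta_-$. Hence ($i = \sqrt{-1}$):

$$\alpha = i\, \frac{\pi h}{N a}, \qquad h = 0, 1, \ldots, N - 1, \qquad (4.4.15)$$

to which correspond the solutions [see Eq. (4.4.10)]

$$\eta^{(h)}_{ja} = \beta \sin \frac{\pi}{N} h j, \qquad h = 1, \ldots, N - 1 \qquad (4.4.16)$$

(the value $h = 0$ only gives the trivial solution $\eta = 0$)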


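with the respective eigenvalues $\omega_h^2$ given by Eq. (4.4.9):

$$\omega_h^2 = \frac{\sigma}{\mu} + \frac{\tau}{\mu}\, \frac{2 \left( 1 - \cos \frac{\pi h}{L} a \right)}{a^2} \qquad (4.4.17)$$

The $N - 1$ solutions (4.4.16) are linearly independent vectors: they are, in
fact, orthogonal. This follows from the general theory of Appendix F, p.525,
since $\omega_{h+1}^2 > \omega_h^2$, $h = 1, \ldots, N - 2$, but the following direct check is somewhat
instructive. Let, in fact, $1 \le h, h' \le N - 1$; then²

² Since $\cos y = \mathrm{Re}\,(e^{iy})$.

$$\sum_{j=1}^{N-1} \eta^{(h)}_{ja}\, \eta^{(h')}_{ja} = \beta^2 \sum_{j=1}^{N-1} \sin \frac{\pi h j}{N}\, \sin \frac{\pi h' j}{N}$$

$$= \frac{\beta^2}{2} \sum_{j=1}^{N-1} \left( \cos \frac{\pi (h - h') j}{N} - \cos \frac{\pi (h + h') j}{N} \right)$$

$$= \frac{\beta^2}{2} \sum_{j=0}^{N-1} \mathrm{Re} \left( e^{i \pi (h - h') j / N} - e^{i \pi (h + h') j / N} \right)$$

$$= \frac{\beta^2}{2}\, \mathrm{Re} \left( \frac{e^{i \pi (h - h')} - 1}{e^{i \pi (h - h') / N} - 1} - \frac{e^{i \pi (h + h')} - 1}{e^{i \pi (h + h') / N} - 1} \right)$$

which, if $h = h'$, has to be interpreted as $\beta^2 N / 2$ and, if $h \ne h'$, is zero since
$e^{i \pi (h - h')} = e^{i \pi (h + h')} = \pm 1$ and $\mathrm{Re}\, (e^{i\alpha} - 1)^{-1} = -\frac{1}{2}$, $\forall \alpha \in \mathbb{R}$. Therefore,

$$\eta^{(h)} \cdot \eta^{(h')} = \beta^2\, \frac{N}{2}\, \delta_{h h'} \qquad (4.4.18)$$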
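
Both the eigenvalue relation and the orthogonality (4.4.18) are easy to check numerically. The sketch below uses plain Python lists; the helper names (`mode`, `omega_squared`, `apply_V_over_a`, `dot`) and the sample parameter values are illustrative assumptions. It verifies that $a^{-1} V \eta^{(h)} = \mu\, \omega_h^2\, \eta^{(h)}$ with $V$ as in Eq. (4.4.5) and $\omega_h^2$ as in Eq. (4.4.17), and that distinct modes are orthogonal.

```python
import math

def mode(h, N, beta=1.0):
    """Components eta_j = beta*sin(pi*h*j/N), j = 1..N-1, as in Eq. (4.4.16)."""
    return [beta * math.sin(math.pi * h * j / N) for j in range(1, N)]

def omega_squared(h, N, L, sigma, tau, mu):
    """Eigenvalue omega_h^2 of Eq. (4.4.17), with a = L/N."""
    a = L / N
    return sigma / mu + (tau / mu) * 2.0 * (1.0 - math.cos(math.pi * h * a / L)) / a**2

def apply_V_over_a(eta, N, L, sigma, tau):
    """(V eta)/a with V from Eq. (4.4.5) and fixed ends eta_0 = eta_N = 0."""
    a = L / N
    p = [0.0] + list(eta) + [0.0]
    return [sigma * p[j] + tau * (2.0 * p[j] - p[j + 1] - p[j - 1]) / a**2
            for j in range(1, N)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

# Assumed sample parameters, chosen only for illustration.
N, L, sigma, tau, mu = 8, 2.0, 0.5, 1.3, 0.7
eta3 = mode(3, N)
lhs = apply_V_over_a(eta3, N, L, sigma, tau)
w2 = omega_squared(3, N, L, sigma, tau, mu)
print(max(abs(l - mu * w2 * e) for l, e in zip(lhs, eta3)))  # rounding-level residual
print(dot(eta3, mode(5, N)), dot(eta3, eta3))                # about 0 and N/2
```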


Hence, using the results of §4.1, the most general motion of the $N - 1$ oscillators
described by Eqs. (4.4.1) and (4.4.2) with $h_0 = h_L = 0$ and $g = 0$ is, $\forall j =
1, \ldots, N - 1$,

$$y_{ja} = \sum_{h=1}^{N-1} A_h \sqrt{\frac{2}{N}} \left( \sin \frac{\pi h}{L}\, ja \right) \cos(\omega_h t + \varphi_h), \qquad (4.4.19)$$

where $\omega_h > 0$ is given by Eq. (4.4.17) and $A_h \ge 0$, $\varphi_h \in [0, 2\pi]$ are arbitrary
constants.

A particular solution to Eq. (4.4.3) can be found as follows. Obviously, the
simplest particular solution is, if existing, a stationary one, $y(t) = c$, i.e. a
solution of the system

$$\sigma c_{ja} + \tau\, \frac{2 c_{ja} - c_{ja+a} - c_{ja-a}}{a^2} = \mu g(ja) + \frac{\tau}{a^2} \left( \delta_{j,N-1}\, h_L + \delta_{j,1}\, h_0 \right) \qquad (4.4.20)$$

for $j = 1, \ldots, N - 1$, where $c_0 = c_L = 0$. These equations immediately follow
from Eq. (4.4.3) in which the terms with the time derivatives have been
eliminated and the inhomogeneous terms depending on $g$ and $h$ have been brought
to the right-hand side.

Call $\gamma$ the vector $\gamma = (\gamma_{ia})_{i=1,\ldots,N-1}$ defined by the right-hand side of Eq.
(4.4.20). Recalling the definition of $V$, Eq. (4.4.5), Eq. (4.4.20) can be written
as

$$a^{-1} V c = \gamma. \qquad (4.4.21)$$

This equation has one and only one solution because $V$, by Eq. (4.4.6), is
positive definite (so $\det V > 0$) if $\sigma > 0$, $\tau > 0$ and its solution $c$ is a particular
solution to Eq. (4.4.3) and, in fact, it is the only stationary solution to Eq.
(4.4.3).

It is even possible to find a useful expression for $c$.

If in Eq. (4.4.16) we choose $\beta = \sqrt{2/N}$, we see that Eq. (4.4.18) says that
$\eta^{(1)}, \ldots, \eta^{(N-1)}$ are $(N - 1)$ vectors with $N - 1$ components forming an
orthonormal basis in $\mathbb{R}^{N-1}$. Furthermore, these vectors are such that, by
construction, [see, also, Eq. (4.4.7)]

$$a^{-1} V \eta^{(h)} = \mu\, \omega_h^2\, \eta^{(h)}. \qquad (4.4.22)$$

Hence it follows

$$\gamma = \sum_{k=1}^{N-1} \hat\gamma(k)\, \eta^{(k)}, \qquad (4.4.23)$$

$$c = \sum_{k=1}^{N-1} \hat c(k)\, \eta^{(k)}, \qquad (4.4.24)$$

where the $\hat c(k)$ are unknown and, setting $Na = L$,

$$\hat\gamma(k) = \sqrt{\frac{2}{N}} \left\{ \sum_{j=1}^{N-1} \mu g(ja) \sin \left( \frac{\pi k}{L}\, ja \right) + \frac{\tau}{a^2} \left( h_0 \sin \frac{\pi k}{L} a + h_L \sin \frac{\pi k}{L} (N - 1) a \right) \right\} \qquad (4.4.25)$$

Using Eq. (4.4.22), Eq. (4.4.21) becomes

$$\hat\gamma(k) = \mu\, \omega_k^2\, \hat c(k) \qquad (4.4.26)$$

and provides an explicit expression for the components of $c$ on the “natural
basis” $\eta^{(1)}, \ldots, \eta^{(N-1)}$.
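
The procedure of Eqs. (4.4.23)-(4.4.26) can be sketched in a few lines of Python. The function name `solve_dirichlet` and the sample data are illustrative assumptions; the right-hand side vector `gamma` is taken as given, i.e. it is assumed to already contain the interior and boundary contributions of Eq. (4.4.20). The code expands `gamma` on the sine basis, divides each coefficient by $\mu\, \omega_k^2 = \sigma + \tau \cdot 2(1 - \cos(\pi k / N))/a^2$, and resums.

```python
import math

def solve_dirichlet(gamma, N, L, sigma, tau):
    """Solve sigma*c_j + tau*(2c_j - c_{j+1} - c_{j-1})/a^2 = gamma_j, j = 1..N-1,
    with c_0 = c_N = 0, via the sine expansion of Eqs. (4.4.23)-(4.4.26)."""
    a = L / N
    c = [0.0] * (N - 1)
    for k in range(1, N):
        # Orthonormal mode eta^(k), i.e. beta = sqrt(2/N) in Eq. (4.4.16).
        eta = [math.sqrt(2.0 / N) * math.sin(math.pi * k * j / N) for j in range(1, N)]
        gamma_hat = sum(g * e for g, e in zip(gamma, eta))
        mu_omega2 = sigma + tau * 2.0 * (1.0 - math.cos(math.pi * k / N)) / a**2
        c_hat = gamma_hat / mu_omega2          # Eq. (4.4.26)
        for j in range(N - 1):
            c[j] += c_hat * eta[j]             # Eq. (4.4.24)
    return c

# Assumed sample data, for illustration only.
N, L, sigma, tau = 6, 1.5, 0.2, 0.9
gamma = [1.0, -0.5, 2.0, 0.0, 0.3]
print(solve_dirichlet(gamma, N, L, sigma, tau))
```

For large $N$ a tridiagonal solver would be cheaper than this $O(N^2)$ expansion; the sine expansion is used here only because it mirrors the text.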

Before stating a proposition summarizing all of the above remarks, it is 
useful to give a very interesting definition allowing a suggestive interpretation 
of Eq. (4.4.21). 


3 Definition. Let $\Omega = [0, L]$, $L/a = N$ = integer. Define the “finite differences
Laplace operator relative to $\mathbb{Z}_a$” as the $(N - 1) \times (N + 1)$ matrix $D$
associating the vector $(D\delta)_{ja}$ with the vector $\delta = (\delta_{ja})_{j=0}^{N}$ so that³

$$(D\delta)_{ja} = \frac{\delta_{ja+a} - 2\delta_{ja} + \delta_{ja-a}}{a^2}, \qquad j = 1, \ldots, N - 1. \qquad (4.4.27)$$

³ The matrix elements of $D$ are $D_{ij} = a^{-2} \left( -2\delta_{i,j} + \delta_{i+1,j} + \delta_{i-1,j} \right)$, $i = 1, \ldots, N - 1$,
$j = 0, \ldots, N$.

In this notation, Eq. (4.4.21) can be written as

$$(\sigma c - \tau D c)_\xi = \mu g(\xi), \quad \xi \in \Omega_a \setminus \partial\Omega, \qquad\qquad c_\xi = h_\xi, \quad \xi \in \partial\Omega. \qquad (4.4.28)$$


4 Definition. Equation (4.4.28) for the vector $c$ will be called, for $\sigma > 0$, $\tau >
0$, a “discrete non homogeneous Dirichlet problem for the region $\Omega$ on $\mathbb{Z}_a^1$ with
boundary datum $h$, interior datum $\mu g$”.


The already remarked existence and uniqueness of the solutions of Eq.
(4.4.21) can be phrased as follows.


8 Proposition. Equation (4.4.28) admits one and only one solution for
arbitrarily given boundary and interior data and for all $\sigma > 0$, $\tau > 0$.

Concluding this section its results are summarized by: 


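9 Proposition. The motions associated with Eqs. (4.4.1) and (4.4.2) have
the form

$$y_\xi^{(a)}(t) = c_\xi^{(a)} + \sum_{h=1}^{N-1} A_h \sqrt{\frac{2}{N}} \left( \sin \frac{\pi h}{L}\, \xi \right) \cos(\omega_h t + \varphi_h) \qquad (4.4.29)$$

for $\xi \in \Omega_a$ with

$$\omega_h^2 = \frac{\sigma}{\mu} + \frac{\tau}{\mu}\, \frac{2 \left( 1 - \cos \frac{\pi h}{L} a \right)}{a^2} \qquad (4.4.30)$$

and the vector $c^{(a)} = (c_\xi)_{\xi \in \Omega_a}$ is the solution to the Dirichlet problem (4.4.28)
with boundary datum $h$ and interior datum $g$. The vector $c_\xi$ is given by

$$c_\xi^{(a)} = \sum_{k=1}^{N-1} \frac{2}{N}\, \frac{\sin \frac{\pi k}{L} \xi}{\mu\, \omega_k^2} \left\{ \sum_{\xi' \in \Omega_a} \mu\, g(\xi') \sin \frac{\pi k}{L} \xi' + \frac{\tau}{a^2} \left( h_0 \sin \frac{\pi k}{L} a + h_L \sin \frac{\pi k}{L} (N - 1) a \right) \right\} \qquad (4.4.31)$$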

Observation. The normal modes have a remarkable “spatial structure”, i.e., a
remarkable $\xi$ dependence. They are in fact interpolated by sinusoidal functions
with “two nodes”, i.e., two zeros, at the “extremes of the string”, 0 and $L$,
and in the $h$-th normal mode such a function has exactly $(h - 1)$ other nodes
in $[0, L]$. This is a complete description of the “wave-form” of the modes.
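
The node count can be seen directly on the lattice values of a mode. The following Python sketch counts sign changes of $\sin(\pi h j / N)$ along $j = 1, \ldots, N - 1$; the function name `sign_changes` is illustrative, and $N = 17$ is an assumed sample size, chosen prime so that no interior node falls exactly on a lattice point.

```python
import math

def sign_changes(h, N):
    """Number of sign changes of the h-th mode sampled at j = 1..N-1."""
    values = [math.sin(math.pi * h * j / N) for j in range(1, N)]
    return sum(1 for x, y in zip(values, values[1:]) if x * y < 0)

N = 17  # assumed sample size
for h in range(1, N):
    print(h, sign_changes(h, N))  # each interior node gives one sign change
```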
4.5 The Vibrating String as a Limiting Case of a Chain
of Oscillators. The Case of Vanishing g and h. Wave
Equation


The motivation for the choice of the Lagrangian (4.4.1) and (4.4.2) lies in the 
request that the mechanical system described by it be a good model for the 
oscillations of an elastic string. 

In this section it will be shown in a mathematically precise sense how this
property is actually realized in the models of Eqs. (4.4.1) and (4.4.2) when $g$
and $h$ vanish. We shall suppose $\sigma > 0$, $\tau > 0$.

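To get an idea of what to try to prove, remark first that Eq. (4.4.3) has a
formal limit given by

$$\mu\, \frac{\partial^2 y}{\partial t^2} = \mu g - \sigma y + \tau\, \frac{\partial^2 y}{\partial x^2}, \qquad (4.5.1)$$

$$y_0 = h_0, \qquad y_L = h_L, \qquad (4.5.2)$$

as $a \to 0$, while Eq. (4.4.28) for the “center” of the oscillations becomes, still
formally,

$$\sigma c_\xi - \tau\, \frac{d^2 c_\xi}{d\xi^2} = \mu g(\xi), \qquad \xi \in [0, L], \qquad c_0 = h_0, \quad c_L = h_L. \qquad (4.5.3)$$

Hence the following proposition should look natural.

10 Proposition. Let $t \to y^{(a)}(t)$, $t \in \mathbb{R}$, be the solution of Eq. (4.4.3) with
$g = h = 0$, $\sigma \ge 0$, $\tau > 0$, $\mu > 0$, following the initial datum

$$y_{ja}^{(a)}(0) = u_0(ja), \qquad j = 1, \ldots, N - 1 \qquad (4.5.4)$$

$$\dot y_{ja}^{(a)}(0) = v_0(ja), \qquad j = 1, \ldots, N - 1 \qquad (4.5.5)$$

where $u_0, v_0 \in C_0^\infty((0, L))$ = {functions in $C^\infty([0, L])$ vanishing in a
neighborhood of 0 and $L$}. Then, $\forall t \in \mathbb{R}$, $\forall x \in [0, L]$, the limit

$$\lim_{\substack{a \to 0 \\ \xi \to x}} y_\xi^{(a)}(t) = w(x, t) \qquad (4.5.6)$$

exists and defines a $C^\infty$ function on $[0, L] \times \mathbb{R}$, verifying the equations:

$$\mu\, \frac{\partial^2 w}{\partial t^2} - \tau\, \frac{\partial^2 w}{\partial x^2} + \sigma w = 0, \qquad \forall (x, t) \in [0, L] \times \mathbb{R} \qquad (4.5.7)$$

$$w(x, 0) = u_0(x), \qquad \forall x \in [0, L], \qquad (4.5.8)$$

$$\frac{\partial w}{\partial t}(x, 0) = v_0(x), \qquad \forall x \in [0, L], \qquad (4.5.9)$$

$$w(0, t) = 0 = w(L, t), \qquad \forall t \in \mathbb{R}. \qquad (4.5.10)$$

Equations (4.5.7)-(4.5.10) admit one and only one $C^\infty$ solution: this solution
is explicitly given by Eq. (4.5.19) below.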


Observations. 

(1) This proposition makes precise the fact that a “regular” initial datum
evolves through Eq. (4.4.3) into a “regular configuration”. Furthermore, it
explains why Eq. (4.5.7) is called the “wave equation” describing the oscillations
of a string with density $\mu$, tension $\tau$, and restoring constant $\sigma$. In the case
$\sigma = 0$, Eq. (4.5.7) is the “D’Alembert wave equation” for the vibrating string
oscillating under the only action of its tension $\tau$.

(2) The derivation of the wave equation presented here and its theory, as
expressed by Proposition 10, starting from the theory of harmonic oscillators, is
a celebrated theorem of Lagrange.

(3) Another explicit solution to Eqs. (4.5.7)-(4.5.10) can be found in Problem
11, p.270 (see, also, §4.7).


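PROOF. Write Eq. (4.4.29) as

$$y_\xi^{(a)}(t) = \sum_{h=1}^{N-1} \sqrt{\frac{2}{N}} \left\{ A_h \sin \frac{\pi h}{L} \xi\, \cos \omega_h t + B_h \sin \frac{\pi h}{L} \xi\, \sin \omega_h t \right\}, \qquad (4.5.11)$$

where $\xi = ia$, $i = 1, \ldots, N - 1$ and try to determine $A_h$, $B_h$ by imposing the
initial data.

Consider the initial data of Eqs. (4.5.4) and (4.5.5) as $(N - 1)$-component
vectors and express them as linear combinations with suitable coefficients, of
the vectors $\eta^{(1)}, \ldots, \eta^{(N-1)}$ with components $(\eta^{(h)})_i = \sqrt{\frac{2}{N}} \sin \frac{\pi h}{L} ia$, which
(as seen in §4.4) form an orthogonal basis in $\mathbb{R}^{N-1}$ [see Eqs. (4.4.16) and
(4.4.18)]:

$$u_0(\xi) = \sum_{h=1}^{N-1} \hat u_0(h) \sqrt{\frac{2}{N}} \sin \frac{\pi h}{L} \xi, \qquad v_0(\xi) = \sum_{h=1}^{N-1} \hat v_0(h) \sqrt{\frac{2}{N}} \sin \frac{\pi h}{L} \xi, \qquad \xi = ia, \; i = 1, \ldots, N - 1. \qquad (4.5.12)$$

After Eq. (4.5.12), it becomes immediate to impose the initial data of Eqs.
(4.5.4) and (4.5.5) to Eq. (4.5.11):

$$A_h = \hat u_0(h), \qquad B_h = \frac{\hat v_0(h)}{\omega_h}. \qquad (4.5.13)$$

Since, on the other hand, $\hat u_0(h)$ and $\hat v_0(h)$ can be obtained by scalar
multiplication of the vectors of Eqs. (4.5.4) and (4.5.5) by $\eta^{(h)}$, Eq. (4.5.13) yields

$$\sqrt{\frac{2}{N}}\, A_h = \frac{2}{N} \sum_{\xi} \left( \sin \frac{\pi h}{L} \xi \right) u_0(\xi), \qquad (4.5.14)$$

$$\sqrt{\frac{2}{N}}\, \omega_h B_h = \frac{2}{N} \sum_{\xi} \left( \sin \frac{\pi h}{L} \xi \right) v_0(\xi), \qquad (4.5.15)$$

and $\sum_\xi$ runs over $\xi = ia$, $i = 1, \ldots, N - 1$.

Then, by the assumptions on $u_0$ and $v_0$, Eqs. (4.5.14) and (4.5.15) contain
summations over $\xi$ which, after being multiplied by $a$, are the Riemann sums
for the integrals between 0 and $L$ of the functions $x \to (\sin \frac{\pi h}{L} x)\, u_0(x)$ and
$x \to (\sin \frac{\pi h}{L} x)\, v_0(x)$, $x \in [0, L]$. Hence, since $\frac{2}{N} = \frac{2}{L}\, a$,

$$\lim_{N \to \infty} \sqrt{\frac{2}{N}}\, A_h = \frac{2}{L} \int_0^L u_0(x) \left( \sin \frac{\pi h}{L} x \right) dx, \qquad (4.5.16)$$

$$\lim_{N \to \infty} \sqrt{\frac{2}{N}}\, \omega_h B_h = \frac{2}{L} \int_0^L v_0(x) \left( \sin \frac{\pi h}{L} x \right) dx, \qquad (4.5.17)$$

where, for $h = 1, 2, \ldots$ [see Eq. (4.4.30)],

$$\omega_h^2 \xrightarrow[a \to 0]{} \bar\omega(h)^2 = \frac{\sigma}{\mu} + \frac{\tau}{\mu} \left( \frac{\pi h}{L} \right)^2. \qquad (4.5.18)$$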
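
The limit of Eq. (4.5.18) can be observed numerically. The sketch below compares the chain frequency of Eq. (4.4.30) with its continuum limit; the function names `omega_chain` and `omega_string` and the parameter values are illustrative assumptions. The identity $2(1 - \cos x) = 4 \sin^2(x/2)$ is used in the code only to avoid cancellation errors for small $a$.

```python
import math

def omega_chain(h, N, L, sigma, tau, mu):
    """omega_h from Eq. (4.4.30) with a = L/N, using 2(1-cos x) = 4 sin^2(x/2)."""
    a = L / N
    x = math.pi * h * a / L
    return math.sqrt(sigma / mu + (tau / mu) * 4.0 * math.sin(x / 2.0) ** 2 / a**2)

def omega_string(h, L, sigma, tau, mu):
    """Continuum limit of Eq. (4.5.18)."""
    return math.sqrt(sigma / mu + (tau / mu) * (math.pi * h / L) ** 2)

# Assumed sample parameters.
L, sigma, tau, mu, h = 1.0, 0.5, 1.0, 1.0, 3
for N in (10, 100, 1000, 10000):
    print(N, omega_string(h, L, sigma, tau, mu) - omega_chain(h, N, L, sigma, tau, mu))
```

The difference is non-negative, because $|\sin y| \le |y|$, and it decreases roughly like $a^2$.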


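Hence, we see that the sum (4.5.11), thought of as a series in $h$ (with vanishing
terms for $h \ge N$), converges term by term, as $a \to 0$ and $\xi \to x \in [0, L]$, to
the series

$$w(x, t) = \sum_{h=1}^{\infty} \sin \frac{\pi h}{L} x\, \left\{ \left( \frac{2}{L} \int_0^L u_0(x') \sin \frac{\pi h}{L} x'\, dx' \right) \cos \bar\omega(h) t + \left( \frac{2}{L} \int_0^L v_0(x') \sin \frac{\pi h}{L} x'\, dx' \right) \frac{\sin \bar\omega(h) t}{\bar\omega(h)} \right\} \qquad (4.5.19)$$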

We now show that the series in Eq. (4.5.19) is uniformly convergent in 
t and x and defines a function w verifying Eqs. (4.5.7)-(4.5.10). This will 
mean that a function w verifying Eqs. (4.5.7)-(4.5.10) does exist. Then we 
shall prove Eq. (4.5.6), and the proof will finally be concluded by proving the 
uniqueness of the solution to Eqs. (4.5.7)-(4.5.10). 

All of the above deductions are based on the following lemma, a corollary 
to the Fourier theorem, proved in Appendix I, p.536. 


11 Lemma. Let $\overline{C}^\infty([0, L])$ be the set of the $C^\infty([0, L])$ real functions
vanishing together with all their even derivatives in the points 0 and $L$. Set

$$\bar u_k = \frac{2}{L} \int_0^L u(x') \sin \frac{\pi k}{L} x'\, dx' \qquad (4.5.20)$$

$\forall u \in \overline{C}^\infty([0, L])$; then it follows that:

(i) $\forall \alpha > 0$, $\exists\, C_\alpha$ such that

$$|\bar u_k| \le C_\alpha\, (1 + k^\alpha)^{-1}, \qquad \forall k \ge 0 \qquad (4.5.21)$$

(ii)
$$u(x) = \sum_{h=0}^{\infty} \bar u_h \sin \frac{\pi h}{L} x \qquad (4.5.22)$$

(iii) Equation (4.5.22) can be differentiated term by term an arbitrary number
of times, giving rise to uniformly convergent series.

(iv) Every function of the form of Eq. (4.5.22) with $\bar u$ verifying Eq. (4.5.21)
is in $\overline{C}^\infty([0, L])$.

Observation. Clearly $\overline{C}^\infty([0, L]) \supset C_0^\infty((0, L))$.


The proof of Proposition 10 can be continued as follows. 

The uniform convergence in $t$ and $x$ of Eq. (4.5.19), as well as the
admissibility of its term-by-term differentiations, follow from (i), Eq. (4.5.21). Call
$w$ the sum of the series (4.5.19): it verifies Eq. (4.5.7) because every term of
Eq. (4.5.19) does [see Eq. (4.5.18) and do a direct check].

Equation (4.5.10) holds since $\sin \frac{\pi h}{L} x$ vanishes in 0 and in $L$, for all integers
$h$. Equations (4.5.8) and (4.5.9) can be checked by computing $w(x, 0)$ and
$\frac{\partial w}{\partial t}(x, 0)$, from Eq. (4.5.19), using (ii) of Lemma 11.

It remains to prove Eq. (4.5.6) and uniqueness. Since Eq. (4.5.11), thought
of as a series in $h$ by setting $A_h, B_h = 0$ for $h \ge N$, converges term by term to
the function in Eq. (4.5.19), we simply have to show that the series (4.5.11)
is uniformly convergent in $a$ and $\xi$ (or, what amounts to the same, in $N$ and
$\xi$). It suffices to show that given $\alpha > 0$ there exists $C'_\alpha$ such that

$$\left| \sqrt{\frac{2}{N}}\, A_h \right| \le \frac{C'_\alpha}{1 + h^\alpha}, \qquad \forall h, \qquad (4.5.23)$$

$$\left| \sqrt{\frac{2}{N}}\, B_h \right| \le \frac{C'_\alpha}{1 + h^\alpha}, \qquad \forall h. \qquad (4.5.24)$$

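Let us, for instance, prove Eq. (4.5.23). From Eqs. (4.5.14) and (4.5.22),
one obtains, $\forall h = 1, \ldots, N - 1$,

$$\sqrt{\frac{2}{N}}\, A_h = \frac{2}{N} \sum_{\xi} \sin \frac{\pi h}{L} \xi \sum_{k=1}^{\infty} \bar u_{0k} \sin \frac{\pi k}{L} \xi = \sum_{k=1}^{\infty} \bar u_{0k} \left( \frac{2}{N} \sum_{\substack{\xi = ia \\ i = 1, \ldots, N-1}} \sin \frac{\pi h}{L} \xi\, \sin \frac{\pi k}{L} \xi \right) \qquad (4.5.25)$$

and, by Eqs. (4.4.16) and (4.4.18), it follows that for $h = 1, \ldots, N - 1$ and $k$
arbitrary (even for $k \ge N$),

$$\frac{2}{N} \sum_{\xi} \sin \frac{\pi h}{L} \xi\, \sin \frac{\pi k}{L} \xi = \delta_{k,h} - \delta_{k,2N-h} + \delta_{k,h+2N} - \cdots = \sum_{p=0}^{\infty} \delta_{k,h+2pN} - \sum_{p=1}^{\infty} \delta_{k,2pN-h}. \qquad (4.5.26)$$

Hence, by Eq. (4.5.21), $\forall \alpha > 1$, $\forall h = 1, \ldots, N - 1$,

$$\left| \sqrt{\frac{2}{N}}\, A_h \right| = \left| \bar u_{0h} - \bar u_{0,2N-h} + \ldots \right| \le \sum_{p=0}^{\infty} \frac{C_\alpha}{1 + (h + 2pN)^\alpha} + \sum_{p=1}^{\infty} \frac{C_\alpha}{1 + (2pN - h)^\alpha} \le \frac{C'_\alpha}{1 + h^\alpha} \qquad (4.5.27)$$

(the last step uses $h + 2pN \ge (2p + 1) h$ and $2pN - h \ge (2p - 1) h$, valid because $h < N$),
implying Eq. (4.5.23) by the arbitrariness of $\alpha$ and because $A_h = 0$ for $h \ge N$.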

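To show uniqueness, it is enough to show that if $w \in C^\infty([0, L] \times \mathbb{R})$
verifies Eqs. (4.5.7)-(4.5.10), with $u_0 = v_0 = 0$, then $w = 0$.

The idea of the proof is based on energy conservation. Equations (4.5.7)-
(4.5.10) should “keep memory” of the fact that they are a formal limit of Eq.
(4.4.3) and it should be possible to define, for every motion $w$ verifying them,
a function which is constant as $t$ varies and which can be obtained as the limit
$a \to 0$ of the energy expression for Eq. (4.4.3). If $y_0 = y_L = 0$, the energy of
the motions of Eq. (4.4.3) is [see Eq. (4.3.13)]

$$E^{(a)} = \frac{a\mu}{2} \sum_{i=1}^{N-1} \dot y_{ia}^2 + \frac{a\sigma}{2} \sum_{i=1}^{N-1} y_{ia}^2 + \frac{a\tau}{2} \sum_{i=0}^{N-1} \frac{(y_{ia+a} - y_{ia})^2}{a^2} \qquad (4.5.28)$$

whose formal limit as $a \to 0$ is

$$E(w, t) = \frac{\mu}{2} \int_0^L \left( \frac{\partial w}{\partial t} \right)^2 dx + \frac{\sigma}{2} \int_0^L w^2\, dx + \frac{\tau}{2} \int_0^L \left( \frac{\partial w}{\partial x} \right)^2 dx. \qquad (4.5.29)$$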
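
As a sanity check on the discrete energy (4.5.28), one can verify numerically that it stays constant along a single normal-mode motion of the homogeneous chain. In the Python sketch below the function names `chain_energy` and `mode_state` and all parameter values are illustrative assumptions; the motion used is $y_{ja}(t) = \sin(\pi h j / N) \cos \omega_h t$ with $\omega_h$ from Eq. (4.4.30).

```python
import math

def chain_energy(y, ydot, a, mu, sigma, tau):
    """Energy (4.5.28); y, ydot list the N-1 interior values, ends fixed at 0."""
    p = [0.0] + list(y) + [0.0]
    kinetic = 0.5 * a * mu * sum(v * v for v in ydot)
    restoring = 0.5 * a * sigma * sum(x * x for x in y)
    tension = 0.5 * a * tau * sum(((p[i + 1] - p[i]) / a) ** 2
                                  for i in range(len(p) - 1))
    return kinetic + restoring + tension

def mode_state(h, N, L, mu, sigma, tau, t):
    """Positions and velocities of the h-th normal mode at time t."""
    a = L / N
    omega = math.sqrt(sigma / mu
                      + (tau / mu) * 2.0 * (1.0 - math.cos(math.pi * h / N)) / a**2)
    shape = [math.sin(math.pi * h * j / N) for j in range(1, N)]
    y = [s * math.cos(omega * t) for s in shape]
    ydot = [-omega * s * math.sin(omega * t) for s in shape]
    return y, ydot, omega

# Assumed sample parameters.
N, L, mu, sigma, tau, h = 12, 3.0, 0.8, 0.4, 1.5, 5
a = L / N
energies = []
for t in (0.0, 0.3, 1.7, 10.0):
    y, ydot, omega = mode_state(h, N, L, mu, sigma, tau, t)
    energies.append(chain_energy(y, ydot, a, mu, sigma, tau))
print(energies)  # all equal, up to rounding, to a*mu*omega^2*N/4
```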


If we show that the solutions of Eqs. (4.5.7)-(4.5.10) in $C^\infty([0, L] \times R)$
are such that $E(w,t)$ remains constant as $t$ varies, uniqueness is proved. In
fact, if $w(x,0) = 0$ and $\frac{\partial w}{\partial t}(x, 0) = 0$, then $E(w,0) = 0$; on the other hand
$E(w,t) = 0 \Rightarrow w(x,t) = 0$, $\forall x \in [0, L]$, if $\sigma \ge 0$, $\tau > 0$ (since $\tau > 0$ forces
$\frac{\partial w}{\partial x} \equiv 0$, while $w(0,t) = 0$). But the difference
between two solutions of Eqs. (4.5.7)-(4.5.10) is a solution with $u_0 = v_0 = 0$,
hence with zero energy: it therefore vanishes identically.

To show the constancy of Eq. (4.5.29) remark that 


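$$\frac{dE}{dt}(w,t) = \int_0^L\Big(\mu\frac{\partial w}{\partial t}\frac{\partial^2 w}{\partial t^2} + \sigma w\frac{\partial w}{\partial t} + \tau\frac{\partial w}{\partial x}\frac{\partial^2 w}{\partial x\,\partial t}\Big)dx \tag{4.5.30}$$

Then integrate the last term in the right-hand side by parts using $\frac{\partial w}{\partial t}(0, t) = \frac{\partial w}{\partial t}(L,t) = 0$, by Eq. (4.5.10). Collecting the integrals into a single integral and
taking Eq. (4.5.7) into account, one finds

$$\frac{dE}{dt} = \int_0^L\frac{\partial w}{\partial t}\Big(\mu\frac{\partial^2 w}{\partial t^2} + \sigma w - \tau\frac{\partial^2 w}{\partial x^2}\Big)dx = 0. \tag{4.5.31}$$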
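As an illustration of the conservation law just derived, one can evaluate the energy (4.5.29) on an explicit solution. The sketch below is not taken from the text: it assumes the single-mode solution $w(x,t) = \sin(\pi h x/L)\cos\omega t$ with $\mu\omega^2 = \sigma + \tau(\pi h/L)^2$, the parameter values are arbitrary, and the integral is approximated by the midpoint rule.

```python
import math

def energy(t, h=3, L=1.0, mu=2.0, sigma=0.5, tau=1.5, n=2000):
    """Midpoint-rule value of E(w,t) in (4.5.29) for w = sin(pi h x/L) cos(omega t)."""
    k = math.pi * h / L
    omega = math.sqrt((sigma + tau * k * k) / mu)   # pulsation of the chosen mode
    dx = L / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        w = math.sin(k * x) * math.cos(omega * t)
        w_t = -omega * math.sin(k * x) * math.sin(omega * t)
        w_x = k * math.cos(k * x) * math.cos(omega * t)
        total += 0.5 * (mu * w_t ** 2 + sigma * w ** 2 + tau * w_x ** 2) * dx
    return total

if __name__ == "__main__":
    for t in (0.0, 0.37, 1.9):
        print(t, energy(t))
```

For this mode the energy should equal $\frac{L}{4}(\sigma + \tau(\pi h/L)^2)$ at every time, which is what the printed values are expected to reproduce up to rounding.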


Observations. 
(1) From the proof, one can see that the condition $u_0, v_0 \in C_0^\infty((0, L))$ has only
been used to apply Lemma 11 through the observation that $C_0^\infty((0, L)) \subset \overline{C}^\infty([0, L])$.

It is then clear that Proposition 10 can be strengthened by replacing the
assumption $u_0, v_0 \in C_0^\infty((0, L))$ with the assumption $u_0, v_0 \in \overline{C}^\infty([0, L])$ and
by substituting Eq. (4.5.10) with

$$w(\cdot,t) \ \text{and}\ \frac{\partial w}{\partial t}(\cdot,t) \in \overline{C}^\infty([0, L]), \qquad \forall t \in R \tag{4.5.32}$$

(where $\cdot$ denotes a dummy variable; in this case, $x \in [0, L]$).

In this way the existence and uniqueness theorem for the wave equations
(4.5.7)-(4.5.9) and (4.5.32) with initial datum $u_0, v_0 \in \overline{C}^\infty([0, L])$ is more satisfactory
because the initial regularity condition is not modified as $t$ evolves.
In fact, from the above proof it is not possible to conclude (and it is generally
false) that when the initial configuration $u_0, v_0$ is built with elements
of $C_0^\infty((0, L))$, then also the evolved configuration at time $t$, $w(x, t)$, $\frac{\partial w}{\partial t}(x, t)$,
consists of elements in $C_0^\infty((0, L))$ (i.e., the initial regularity is generally not
preserved).
(2) One may think that $u_0, v_0 \in \overline{C}^\infty([0, L])$ is still not optimal and that, perhaps,
the optimal condition could be $u_0, v_0 \in C^\infty([0, L])$ plus $u_0(0) = u_0(L) = 0$,
$v_0(0) = v_0(L) = 0$. By counterexamples, it can be shown that this is not
the case (see exercises). To further extend the set of the initial configurations,
one has to give up $C^\infty$ smoothness.


4.5.1 Exercises 


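1. Consider the wave equation for $(x,t) \in R^2$

$$\frac{\partial^2 w}{\partial t^2} - c^2\frac{\partial^2 w}{\partial x^2} = 0.$$

Given $u, v \in C^\infty(R)$, show that

$$w(x,t) = \frac{u(x + ct) + u(x - ct)}{2} + \frac{1}{2c}\int_{x-ct}^{x+ct} v(\xi)\,d\xi$$

is a $C^\infty$ solution verifying the initial datum $(u, v)$.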
2. In the context of Problem 1, suppose that $\int_{-\infty}^{+\infty}(u'(x)^2 + v(x)^2)\,dx < +\infty$. Show
that $w$ is the only $C^\infty(R^2)$ solution with finite energy $E = \frac{1}{2}\int_R\big((\frac{\partial w}{\partial t})^2 + c^2(\frac{\partial w}{\partial x})^2\big)dx$
and datum $(u,v)$. (Hint: Repeat the energy conservation argument at the end of the proof
of Proposition 10.)


3. Find the relations between u and v, in the context of Problem 1, necessary to guarantee 
that w is a “purely progressive” or “purely regressive” wave, i.e., w(x,t) = a(x — ct) or 
$w(x, t) = b(x + ct)$.


4. Let $u \in C_0^\infty((0, +\infty))$ and suppose that $u(x) = 0$, unless $x \in (a,b)$, $0 < a < b < +\infty$, and
$u(x) > 0$ for $x \in (a,b)$. Let $v(x) = c\frac{du}{dx}(x)$. Show that, up to a time $t_0 > 0$, the solution $w$ of
the equation $\frac{\partial^2 w}{\partial t^2} - c^2\frac{\partial^2 w}{\partial x^2} = 0$ with initial data $(u, v)$ is such that $w(x, t) \in C_0^\infty((0, +\infty))$
for $t < t_0$. (Hint: Use Problem 3 by noting that up to $t_0 = a/c$ the solution is $w(x, t) = u(x + ct)$.)
5. Consider the wave equation on $[0, 1]$, $\frac{\partial^2 w}{\partial t^2} - c^2\frac{\partial^2 w}{\partial x^2} = 0$, with the initial data vo =
ee 

—c 40 ug (x) = zrela) foro <r < 4, uo(x) = 0 for |x| > 4. Letting n > 1, show 

that up to to = +, the function w(x,t) = uo(x — ct) if 0 < x — ct < 4 or w(x,t) = 0, 


otherwise, is a $C^{(2n)}([0, 1])$ solution following a $C^\infty([0, 1])$ datum. Infer that the conditions
$u_0, v_0 \in \overline{C}^\infty([0, L])$ in Proposition 10 cannot be replaced by the more general ones of the
Observation (2), p.269. (Hint: Show by the same energy conservation argument at the end
of the proof of Proposition 10 that there is uniqueness for the $C^{(2)}$ solutions of the wave
equation, etc.)


6. Is the condition 7 > 0 in Proposition 10 essential? If yes, give a physical interpretation 
of the reason. 

7. A solution to the equation $\frac{\partial^2 w}{\partial t^2} - c^2\frac{\partial^2 w}{\partial x^2} + m^2 w = 0$, $(x,t) \in R^2$, having the form
$e^{i(kx - \omega t)}$ is called a “plane wave” solution. Its real and imaginary parts are called “real
plane wave” solutions. Find the plane wave solutions to the above equation.


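8. Find the energy per unit length of a real plane wave solution to the equation in Problem
7. (Hint: $E = \lim_{L\to\infty}\frac{1}{2L}\int_{-L}^{L}\frac{1}{2}\big((\frac{\partial w}{\partial t})^2 + c^2(\frac{\partial w}{\partial x})^2 + m^2w^2\big)dx$ ...)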


9. Formulate and prove Proposition 10 in the case when the segment [0, L] is replaced by a 
closed circle, i.e., the oscillators in Fig. 4.1 are ideally bound to the set of equispaced lines 
orthogonal to a circle with radius R, obviously without fixed extreme oscillators (“periodic 


boundary conditions”). Show that Eqs. (4.5.7)-(4.5.9) remain the same while Eq. (4.5.10) is 


replaced by $u_0, v_0 \in C^\infty(T^1(2\pi R)) \stackrel{def}{=} \{C^\infty$ periodic functions with period $2\pi R\}$. (Hint: The


ordinary Fourier theorem replaces Lemma 11 in the proof (which actually becomes easier).) 


10. In the context of Problem 1, call $V_0(x) = \int_0^x v_0(\xi)\,d\xi$. Show that to compute $w$ at the
point $(x,t)$, it is enough to know the data $u_0, V_0$ at the points $x \pm ct$ (“propagation along
characteristic lines”).


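11. Consider the wave equations (4.5.7)-(4.5.10). Define $\tilde u_0, \tilde v_0$ as
$\tilde u_0(x) = u_0(x)$, if $0 \le x \le L$,
$\tilde u_0(L + x) = -u_0(L - x)$, if $L \le L + x \le 2L$, and
$\tilde u_0(x) = \tilde u_0(x - 2kL)$, if $x - 2kL \in [0, 2L)$.


Likewise, define $\tilde v_0$. Show that $\tilde u_0, \tilde v_0$ are $C^\infty(R)$ functions if and only if $u_0, v_0 \in \overline{C}^\infty([0, L])$.
Let $\tilde V_0(x) \stackrel{def}{=} \int_0^x \tilde v_0(\xi)\,d\xi$. Show that the solution to Eqs. (4.5.7)-(4.5.10) can be written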
$$w(x,t) = \frac{\tilde u_0(x + ct) + \tilde u_0(x - ct)}{2} + \frac{\tilde V_0(x + ct) - \tilde V_0(x - ct)}{2c}$$


(see, also, §4.7). Find a statement analogous to the one in Problem 10 in terms of $\tilde u_0, \tilde V_0$.


4.6 Vibrating String: General Case. Dirichlet Problem in 
[0, L] 


Having in mind the results of §4.4, it is convenient to study preliminarily what
happens to the stationary solution $c^{(a)}$ [see Eqs. (4.4.29) and (4.4.31)] in the
limit $a \to 0$, $\xi \to x$.

The heuristic considerations at the beginning of §4.5 suggest the following
proposition.


12 Proposition. The stationary solution $c^{(a)}$ of the oscillator-chain equations
(4.4.1) and (4.4.2) given by Eq. (4.4.31) is such that the limit

$$c(x) = \lim_{a\to0,\ \xi\to x} c^{(a)}_\xi \tag{4.6.1}$$

exists for $x \in [0, L]$ and defines a function $c \in C^\infty([0, L])$ such that

$$\sigma c(x) - \tau\frac{d^2c}{dx^2}(x) = \mu g(x), \qquad x \in [0, L], \tag{4.6.2}$$

$$c(0) = h_0, \qquad c(L) = h_L. \tag{4.6.3}$$
PROOF. Define 
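$$c^{(a)1}_\xi \stackrel{def}{=} \sum_{k=1}^{N-1}\frac{\sin\frac{\pi k}{L}\xi}{\omega_k^2}\Big(\frac{2}{N}\sum_{\xi'} g(\xi')\sin\frac{\pi k}{L}\xi'\Big), \tag{4.6.4}$$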
N-1 
(a)2 def wk 1 rT 
CES a 7 ee (4.6.5) 


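for $\xi = ia$, $i = 0, 1, \dots, N$, $N = L/a$, and by Eq. (4.4.31),

$$c^{(a)}_\xi = c^{(a)1}_\xi + c^{(a)2}_\xi \tag{4.6.6}$$

and $c^{(a)1}$ solves Eq. (4.4.28) for $h = 0$ while $c^{(a)2}$ solves it for $g = 0$.
We shall separately show the existence of the limits:

$$\lim_{a\to0,\ \xi\to x} c^{(a)1}_\xi = c^{(1)}(x), \tag{4.6.7}$$

$$\lim_{a\to0,\ \xi\to x} c^{(a)2}_\xi = c^{(2)}(x), \tag{4.6.8}$$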




and that, furthermore, they define two $C^\infty([0, L])$ functions verifying Eqs.
(4.6.2) and (4.6.3) with $h = 0$ or $g = 0$, respectively.

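First study Eq. (4.6.7) using Eq. (4.6.4) as a starting point. Think of Eq.
(4.6.4) as a series in $k$ with all the terms with $k \ge N$ vanishing; then such a
series converges term by term, when $\xi \to x$, $a \to 0$, to the series

$$c^{(1)}(x) = \sum_{k=1}^{\infty}\frac{\sin\frac{\pi k}{L}x}{\bar\omega(k)^2}\Big(\frac{2}{L}\int_0^L g(x')\sin\frac{\pi k}{L}x'\,dx'\Big), \tag{4.6.9}$$

where $\bar\omega(k)^2 = \frac{\sigma}{\mu} + \frac{\tau}{\mu}\big(\frac{\pi k}{L}\big)^2 = \lim_{a\to0}\omega_k^2$ is given by Eq. (4.5.18).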

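If $g \in \overline{C}^\infty([0, L])$, we could infer from Lemma 11, p.266, Eq. (4.5.21),
that the above series is a uniformly convergent series, term by term indefinitely
differentiable. It would then be clear that $c^{(1)}$ verifies Eqs. (4.6.2) and (4.6.3)
with $h = 0$ since

$$\sigma c^{(1)}(x) - \tau\frac{d^2c^{(1)}}{dx^2}(x) = \sum_{k=1}^{\infty}\frac{\sigma + \tau(\frac{\pi k}{L})^2}{\bar\omega(k)^2}\sin\frac{\pi k}{L}x\,\Big(\frac{2}{L}\int_0^L g(x')\sin\frac{\pi k}{L}x'\,dx'\Big) \tag{4.6.10}$$

and, since $\sigma + \tau(\frac{\pi k}{L})^2 = \mu\,\bar\omega(k)^2$, by Lemma 11 the right-hand side is just $\mu g$.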

It would also be easy to prove the validity of Eq. (4.6.7) with $c^{(1)}$ defined
by Eq. (4.6.9). One should repeat, word by word, the §4.5 proof where the
convergence of $y^{(a)}(t)$ to its “term-by-term limit”, Eq. (4.5.19), is discussed.

In the present case, however, $g \in C^\infty([0, L])$ but not necessarily $g \in \overline{C}^\infty([0, L])$,
and the proof of Eq. (4.6.7), of the convergence of Eq. (4.6.9),
and of the $C^\infty([0, L])$ nature of $c^{(1)}$ is more delicate.

Technically, such a problem must be present and it takes place because
the series (4.6.10) cannot converge too well to $\mu g(x)$: if, in fact, it did converge
absolutely and if it had $\mu g$ as its sum, it would follow $g(0) = g(L) = 0$, for
instance, which might be false for a given $g$. This phenomenon always appears
whenever one tries to approximate a function $g$ with functions (in our case
$\sin\frac{\pi k}{L}x$) with properties too different from those of $g$ (for instance, $g(0) \ne 0$ in
general, but all the approximating functions vanish in 0!).

The upcoming discussion is interesting because it illustrates how it is some- 
times possible to bypass the obstacle just met: it is in fact a type of problem 
that often occurs in mathematical analysis. 

We shall first show that the series in Eq. (4.6.9) converges to some function
$c^{(1)}$ on $[0, L]$, continuous and once differentiable term by term. Then we shall
show that Eq. (4.6.9) also verifies Eq. (4.6.7).

Finally, and this will be the most interesting part, we shall show that
Eq. (4.6.9) verifies the Dirichlet problem, Eq. (4.6.2); and this will imply, by
the regularity theorem, Proposition 1, p.14, that, actually, $c^{(1)} \in C^\infty([0, L])$,
although, of course, it may be that $c^{(1)} \notin \overline{C}^\infty([0, L])$.
To show that the series (4.6.9) is convergent and once differentiable term
by term, we can remark that, setting $g' = \frac{dg}{dx}$:

$$\hat g_k \equiv \frac{2}{L}\int_0^L g(x')\sin\frac{\pi k}{L}x'\,dx' = -\frac{2}{L}\Big[g(x')\frac{\cos\frac{\pi k}{L}x'}{\pi k/L}\Big]_0^L + \frac{2}{L}\frac{L}{\pi k}\int_0^L g'(x')\cos\frac{\pi k}{L}x'\,dx'$$
$$= \frac{2}{\pi k}\big(g(0) - (-1)^k g(L)\big) + \frac{2}{\pi k}\int_0^L g'(x')\cos\frac{\pi k}{L}x'\,dx'. \tag{4.6.11}$$


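This implies, if $M_{g'} = \max_{x\in[0,L]}|g'(x)|$:

$$|\hat g_k| \le \frac{2}{\pi k}\big(|g(0)| + |g(L)| + L M_{g'}\big) \tag{4.6.12}$$

which means that the series (4.6.9) is uniformly convergent together with its
derivative series: since $\bar\omega(k)^2$ diverges as $k^2$ for $k \to \infty$, in fact, such series are
respectively bounded above by the convergent series [see Eq. (4.6.12)]

$$\sum_{k=1}^{\infty}\frac{|\hat g_k|}{\bar\omega(k)^2} \quad\text{and}\quad \sum_{k=1}^{\infty}\frac{|\hat g_k|}{\bar\omega(k)^2}\frac{\pi k}{L}. \tag{4.6.13}$$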


Hence, by the series differentiation theorems, Eq. (4.6.9) converges and its 
derivative can be computed by series differentiation and is a continuous func- 
tion (as a sum of a uniformly convergent series of continuous functions). 
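The slow, $1/k$, decay allowed by the bound (4.6.12) is easy to observe numerically. The following sketch is only illustrative: the helper names `sine_coefficient` and `bound_4_6_12`, and the test function $g(x) = 1 + x$ on $[0,1]$ (for which $g(0) \ne 0 \ne g(L)$), are assumptions not taken from the text; the integral is approximated by the midpoint rule.

```python
import math

def sine_coefficient(g, k, L=1.0, n=4000):
    """Approximate (2/L) * integral_0^L g(x) sin(pi k x/L) dx by the midpoint rule."""
    dx = L / n
    return (2.0 / L) * sum(
        g((i + 0.5) * dx) * math.sin(math.pi * k * (i + 0.5) * dx / L) * dx
        for i in range(n)
    )

def bound_4_6_12(k, g0, gL, max_dg, L=1.0):
    """Right-hand side of (4.6.12): (2/(pi k)) (|g(0)| + |g(L)| + L max|g'|)."""
    return 2.0 / (math.pi * k) * (abs(g0) + abs(gL) + L * max_dg)

if __name__ == "__main__":
    g = lambda x: 1.0 + x          # g(0) = 1, g(L) = 2, max |g'| = 1
    for k in (1, 2, 5, 10, 20):
        print(k, sine_coefficient(g, k), bound_4_6_12(k, g(0.0), g(1.0), 1.0))
```

For this choice of $g$ the integration by parts of Eq. (4.6.11) gives $\hat g_k = \frac{2}{\pi k}(1 - 2(-1)^k)$, so the coefficients decay exactly like $1/k$ and the division by $\bar\omega(k)^2 \sim k^2$ is what makes the series (4.6.9) and its derivative series converge.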

We now show that Eq. (4.6.9) verifies Eq. (4.6.7). Since, as already ob- 
served, the term-by-term limit of Eq. (4.6.4), thought of as a series in k, is 
Eq. (4.6.9), it will suffice to show that such a term-by-term limit is actually 
correct. In other words, it will suffice to show that Eq. (4.6.4), thought of as 
a series in $k$ with all the terms with $k \ge N$ vanishing, is uniformly convergent
with respect to $a$ and $\xi$.

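We shall show this by dominating the series (4.6.4) by the series

$$\sum_{k}\frac{1}{\omega_k^2}\,2M_g, \qquad \text{if } M_g = \max_{x\in[0,L]}|g(x)|, \tag{4.6.14}$$

where the terms with $k \ge N$ are thought to be zero.
Recalling the form of $\omega_k$, see Eq. (4.4.17), and using the inequality

$$\frac{1 - \cos y}{y^2} \ge \frac{2}{\pi^2}, \qquad \text{if } y \in [0, \pi], \tag{4.6.15}$$

we see that if $0 < k < N$:

$$\omega_k^{-2} = \Big(\frac{\sigma}{\mu} + \frac{2\tau}{\mu}\frac{1 - \cos\frac{\pi k}{L}a}{a^2}\Big)^{-1} \le \Big(\frac{\sigma}{\mu} + \frac{4\tau}{\mu}\frac{k^2}{L^2}\Big)^{-1}. \tag{4.6.16}$$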
Hence, Eq. (4.6.14) is a series which is bounded above by the series in which
$\omega_k^{-2}$ is replaced by the right-hand side of Eq. (4.6.16), and this last series is
dominated by

$$\sum_{k=1}^{\infty} 2M_g\Big(\frac{\sigma}{\mu} + \frac{4\tau}{\mu}\frac{k^2}{L^2}\Big)^{-1} < +\infty \tag{4.6.17}$$

having removed only in this last step the restriction $k \le N - 1$. This proves
that Eq. (4.6.4) is uniformly convergent with respect to the parameters $a, \xi, N$
and, hence, Eq. (4.6.7) follows.

We now must show that Eq. (4.6.9) is a $C^\infty([0, L])$ function verifying Eqs.
(4.6.2) and (4.6.3) with $h_0 = 0$, $h_L = 0$. Equation (4.6.3) is obvious since Eq.
(4.6.9) has been proved to converge (and all its terms vanish for $x = 0$ or
$x = L$). To prove Eq. (4.6.2), we use the fact that, as already remarked, it
would be obvious if $g \in \overline{C}^\infty([0, L])$.

Given $\varepsilon > 0$, let $g_\varepsilon \in C_0^\infty((0, L)) \subset \overline{C}^\infty([0, L])$ be a function such that:

(i) $$g_\varepsilon(x) = g(x) \quad\text{if } \varepsilon \le x \le L - \varepsilon. \tag{4.6.18}$$

(ii) $$\frac{1}{L}\int_0^L |g_\varepsilon(x) - g(x)|\,dx < \varepsilon. \tag{4.6.19}$$

(iii) The derivative $g'_\varepsilon$ of $g_\varepsilon$, see Fig. 4.6, is such that

$$\int_0^L |g'_\varepsilon(x)|\,dx < \int_0^L |g'(x)|\,dx + 2M_g. \tag{4.6.20}$$

We leave as an exercise based on Appendix C, p. 521, the proof that such a
function indeed exists (note that (iii) expresses that $g_\varepsilon$ can be chosen to go
from zero to $g(\varepsilon)$ or from $g(L - \varepsilon)$ to 0 without oscillating too much, i.e., with
a derivative changing sign once at most without growing too large).


Figure 4.6: Approximating a $C^\infty([0, L])$ function by a $C_0^\infty((0, L))$ function (abscissae marked: $0$, $\varepsilon$, $L - \varepsilon$, $L$).
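Then define

$$\hat g_{\varepsilon,k} = \frac{2}{L}\int_0^L g_\varepsilon(x')\sin\frac{\pi k}{L}x'\,dx', \qquad \bar c^{(1)}_\varepsilon(x) = \sum_{k=1}^{\infty}\frac{\hat g_{\varepsilon,k}}{\bar\omega(k)^2}\sin\frac{\pi k}{L}x, \tag{4.6.21}$$

and, since $g_\varepsilon \in \overline{C}^\infty([0, L])$, we already mentioned that

$$\sigma\bar c^{(1)}_\varepsilon - \tau\frac{d^2\bar c^{(1)}_\varepsilon}{dx^2} = \mu g_\varepsilon, \qquad \bar c^{(1)}_\varepsilon(0) = \bar c^{(1)}_\varepsilon(L) = 0 \tag{4.6.22}$$

which implies

$$\bar c^{(1)}_\varepsilon(x) = \int_0^x\frac{d\bar c^{(1)}_\varepsilon}{dx}(x')\,dx' = \int_0^x dx'\Big[\frac{d\bar c^{(1)}_\varepsilon}{dx}(0) + \int_0^{x'}\frac{d^2\bar c^{(1)}_\varepsilon}{dx^2}(x'')\,dx''\Big]. \tag{4.6.23}$$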
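If we show that uniformly in $x \in [0, L]$:

$$\lim_{\varepsilon\to0}\bar c^{(1)}_\varepsilon(x) = c^{(1)}(x), \qquad \lim_{\varepsilon\to0}\frac{d\bar c^{(1)}_\varepsilon}{dx}(x) = \frac{dc^{(1)}}{dx}(x) \tag{4.6.24}$$

we shall be able to take the limit in Eq. (4.6.23), after using Eq. (4.6.22) to express the second derivative, and obtain

$$c^{(1)}(x) = \int_0^x dx'\Big[\frac{dc^{(1)}}{dx}(0) + \int_0^{x'}\frac{\sigma c^{(1)}(x'') - \mu g(x'')}{\tau}\,dx''\Big] \tag{4.6.25}$$

implying by assumed continuity of $g$ and by the above proved continuity of $c^{(1)}$
that $c^{(1)}$ is twice differentiable and by twofold differentiation of Eq. (4.6.25)
that it verifies Eq. (4.6.2).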

The regularity theorem of §2.2, Proposition 1, p.14, will then permit us to
deduce from the fact that $c^{(1)}$ is twice differentiable with continuous derivatives
and verifies Eq. (4.6.2) that $c^{(1)}$ is in $C^\infty([0, L])$.$^4$

Therefore, it remains to prove that the limits of Eq. (4.6.24) are correct
and uniform in $x \in [0, L]$.

We already know that $c^{(1)}$ and its first derivative are given by the series
(4.6.9) and by the sum of its term-by-term derivative. Such series are also the
limits, term-by-term, of the series in Eq. (4.6.21) and of its derivative series
because by Eqs. (4.6.19) and (4.6.21):

$$|\hat g_{\varepsilon,k} - \hat g_k| < 2\varepsilon, \qquad \forall k > 0. \tag{4.6.26}$$


Hence, the proof of Eq. (4.6.24) is again a problem of exchanging a limit with 
a series summation. 

The necessary uniformity of the limit and the convergence of the series 
follow from the identity: 


4 This also follows directly from Eq. (4.6.2) since it shows that the second derivative of
$c^{(1)}$ is continuously differentiable because such are $g$ and $c^{(1)}$, etc.
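$$\hat g_{\varepsilon,k} = \frac{2}{L}\int_0^L g_\varepsilon(x)\sin\frac{\pi k}{L}x\,dx = \frac{2}{\pi k}\int_0^L g'_\varepsilon(x)\cos\frac{\pi k}{L}x\,dx$$
$$= \frac{2}{\pi k}\Big(\int_\varepsilon^{L-\varepsilon} g'(x)\cos\frac{\pi k}{L}x\,dx + \int_{x\notin[\varepsilon,L-\varepsilon]} g'_\varepsilon(x)\cos\frac{\pi k}{L}x\,dx\Big). \tag{4.6.27}$$

Hence, by Eq. (4.6.20),

$$|\hat g_{\varepsilon,k}| \le \frac{2LM_{g'} + 2\int_0^L |g'_\varepsilon(x)|\,dx}{\pi k} \le \frac{4LM_{g'} + 4M_g}{\pi k} \tag{4.6.28}$$

and, therefore, the series (4.6.21) and its derivative series are dominated by
the series ($\varepsilon$ independent and convergent):

$$\sum_{k=1}^{\infty}\frac{4LM_{g'} + 4M_g}{\pi k\,\bar\omega(k)^2} \quad\text{and}\quad \sum_{k=1}^{\infty}\frac{4LM_{g'} + 4M_g}{\pi k\,\bar\omega(k)^2}\frac{\pi k}{L} \tag{4.6.29}$$

proving their uniform convergence and, hence, Eq. (4.6.24).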

To conclude the proof of Proposition 12, we still have to treat $c^{(a)2}$ defined
by Eq. (4.6.5) or by being the unique solution to the equations [see Eq.
(4.4.28)]:

$$(\sigma c^{(a)2} - \tau D c^{(a)2})_\xi = 0, \qquad \xi = ja,\ j = 1, \dots, N-1,$$
$$c^{(a)2}_0 = h_0, \qquad c^{(a)2}_L = h_L. \tag{4.6.30}$$


Suppose, first, that $\sigma > 0$. The expression (4.6.5) is not too helpful for
investigating the limit $a \to 0$, $\xi \to x$. We therefore look for an alternative representation
for $c^{(a)2}$ in analogy with the theory of linear differential equations.

We look for a solution of Eq. (4.6.30) having the form

$$c^{(a)2}_{ja} = \beta_0 e^{-\lambda ja} + \beta_1 e^{-\lambda(L - ja)}, \qquad j = 0, \dots, N \tag{4.6.31}$$

where in the second term we use (instead of an arbitrary constant factor $\beta$)
the constant factor $\beta_1 e^{-\lambda L}$, still arbitrary because such is $\beta_1$, but yielding a
more symmetric expression (in which 0 and $L$ “play the same role”).
The parameters $\beta_0, \beta_1, \lambda$ are to be determined so that Eq. (4.6.30) is verified.
Equation (4.6.30) will hold for $j = 2, \dots, N-2$ if

$$\sigma - \tau\frac{e^{\lambda a} + e^{-\lambda a} - 2}{a^2} = 0 \tag{4.6.32}$$

which, via a simple discussion, is shown to admit a unique positive solution $\lambda = \lambda_a$ such that

$$\lim_{a\to0}\lambda_a = \sqrt{\sigma/\tau}. \tag{4.6.33}$$
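Furthermore, Eq. (4.6.30) for $j = 1$ or $N - 1$ says, by taking Eq. (4.6.32) into
account,

$$\beta_0 + \beta_1 e^{-\lambda L} = h_0, \qquad \beta_0 e^{-\lambda L} + \beta_1 = h_L, \tag{4.6.34}$$

i.e., $\beta_0 = \frac{h_0 - h_L e^{-\lambda L}}{1 - e^{-2\lambda L}}$, $\beta_1 = \frac{h_L - h_0 e^{-\lambda L}}{1 - e^{-2\lambda L}}$.

From Eqs. (4.6.31), (4.6.33), and (4.6.34), it is now immediate to take the
limit $a \to 0$, $ja \to x$. One finds

$$c^{(2)}(x) = \bar\beta_0 e^{-\sqrt{\sigma/\tau}\,x} + \bar\beta_1 e^{-\sqrt{\sigma/\tau}\,(L - x)}, \tag{4.6.35}$$

where $\bar\beta_0, \bar\beta_1$ solve Eq. (4.6.34) with $\lambda$ replaced by $\sqrt{\sigma/\tau}$,
which is immediately checked to verify Eqs. (4.6.2) and (4.6.3) with $g = 0$.
The case $\sigma = 0$ is analogously treated by replacing Eq. (4.6.31) with

$$c^{(a)2}_{ja} = \beta_0 + \beta_1 ja, \tag{4.6.36}$$

and one eventually finds

$$c^{(2)}(x) = h_0 + \frac{x}{L}(h_L - h_0) \tag{4.6.37}$$

and Proposition 12 is completely proved. ∎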
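The convergence stated in Proposition 12 can be observed on a computer in the case $g = 0$. The sketch below is only an illustration under stated assumptions: the discrete equation is taken to be $\sigma c_j - \tau (c_{j+1} - 2c_j + c_{j-1})/a^2 = \mu g(ja)$ with $c_0 = h_0$, $c_N = h_L$, it is solved by the standard tridiagonal (Thomas) elimination, and the function names are illustrative. The continuum solution for $g = 0$, $\sigma > 0$ is written in the equivalent hyperbolic-sine form.

```python
import math

def solve_discrete_dirichlet(g, h0, hL, L=1.0, N=200, mu=1.0, sigma=1.0, tau=1.0):
    """Solve sigma*c_j - tau*(c_{j+1} - 2 c_j + c_{j-1})/a^2 = mu*g(j a), 0 < j < N,
    with c_0 = h0 and c_N = hL, by tridiagonal (Thomas) elimination."""
    a = L / N
    off = -tau / a ** 2                    # coefficient of c_{j-1} and c_{j+1}
    diag = sigma + 2.0 * tau / a ** 2      # coefficient of c_j
    n = N - 1                              # number of interior unknowns
    rhs = [mu * g(j * a) for j in range(1, N)]
    rhs[0] -= off * h0                     # move the known boundary values to the right
    rhs[-1] -= off * hL
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = off / diag, rhs[0] / diag
    for j in range(1, n):                  # forward elimination
        denom = diag - off * cp[j - 1]
        cp[j] = off / denom
        dp[j] = (rhs[j] - off * dp[j - 1]) / denom
    c = [0.0] * n
    c[-1] = dp[-1]
    for j in range(n - 2, -1, -1):         # back substitution
        c[j] = dp[j] - cp[j] * c[j + 1]
    return [h0] + c + [hL]

def continuum_solution(x, h0, hL, L=1.0, sigma=1.0, tau=1.0):
    """Solution of sigma*c - tau*c'' = 0, c(0) = h0, c(L) = hL, for sigma > 0."""
    lam = math.sqrt(sigma / tau)
    return (h0 * math.sinh(lam * (L - x)) + hL * math.sinh(lam * x)) / math.sinh(lam * L)

if __name__ == "__main__":
    N = 200
    c = solve_discrete_dirichlet(lambda x: 0.0, 1.0, 2.0, N=N)
    print(c[N // 2], continuum_solution(0.5, 1.0, 2.0))
```

With $\sigma = 0$ the same routine returns the linear interpolation between $h_0$ and $h_L$, in agreement with Eq. (4.6.37).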


It is useful to collect all the results of this and the preceding section into a
single statement.

13 Corollary. Let $t \to y^{(a)}(t)$ be a motion verifying Eq. (4.4.3) with initial
data

$$y^{(a)}_\xi(0) = c^{(a)}_\xi + u_0(\xi), \qquad \dot y^{(a)}_\xi(0) = v_0(\xi), \tag{4.6.38}$$

where $u_0, v_0 \in \overline{C}^\infty([0, L])$ and $c^{(a)}$ is a solution to the discrete Dirichlet problem,
Eq. (4.4.28). Then the limit

$$w(x,t) = \lim_{a\to0,\ \xi\to x} y^{(a)}_\xi(t) = c(x) + \bar w(x, t) \tag{4.6.39}$$

exists and $c \in C^\infty([0,L])$ is the solution to the “Dirichlet problem”

$$\sigma c(x) - \tau\frac{d^2c}{dx^2}(x) = \mu g(x), \qquad c(0) = h_0, \quad c(L) = h_L, \tag{4.6.40}$$

while $\bar w \in C^\infty([0,L] \times R)$ verifies the wave equations (4.5.7)-(4.5.10) and
$\bar w(\cdot,t) \in \overline{C}^\infty([0, L])$, $\forall t \in R$.
4.7 Elastic Film. The Dirichlet Problem in $\Omega \subset R^2$ and
General Considerations on the Waves


The theory of the oscillations of an elastic film is considerably more complex 
and interesting than that of the elastic string of §4.3-4.6. The results, however, 
are very similar. We shall not enter into the details of a theory that would lead 
us quite far from our program of analysis of the simplest mechanical systems. 

We only give some terminology and formulate for illustrative purposes 
some easy propositions. 

We shall then conclude our introduction to wave theory by defining the 
wave propagation velocity, studying it in the simple case of the elastic string 
subject only to tension forces ($\sigma = 0$, $h = 0$, $g = 0$).


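5 Definition. Let $\Omega \subset R^2$ be a bounded open connected region with a boundary
$\partial\Omega$ which is a regular surface (see Definition 10, p.170). Let $\Omega_a = \Omega \cap Z_a^2$,
and $\partial\Omega_a$ = {set of points of $\partial\Omega$ lying on the intersections between $\partial\Omega$ and
the bonds of the lattice $Z_a^2$}.

The discrete Laplace operator on $\Omega$ relative to $Z_a^2$ is defined as the linear
transformation $D$ associating with every vector $\delta = (\delta_\xi)_{\xi\in\Omega_a\cup\partial\Omega_a}$ the vector
$((D\delta)_\xi)_{\xi\in\Omega_a}$ given by

$$(D\delta)_\xi = -\sum_{e}\frac{a}{\varepsilon_a(\xi, e)}\,\frac{\delta_\xi - \delta_{\xi+\varepsilon_a(\xi,e)e}}{a^2}, \qquad \xi \in \Omega_a, \tag{4.7.1}$$

where $e = \pm e_1, \pm e_2$ ($e_1$ and $e_2$ being the two unit vectors parallel to the axes
of $Z_a^2$) and, for $\xi \in \Omega_a$:

$$\varepsilon_a(\xi, e) = \{\text{distance between } \xi \text{ and its nearest neighbor in } \Omega_a \cup \partial\Omega_a, \text{ in the direction } e\}. \tag{4.7.2}$$

The “$Z_a^2$-discretized” Dirichlet problem in $\Omega$ with interior data $g = (g_\xi)_{\xi\in\Omega_a}$
and boundary data $h = (h_\xi)_{\xi\in\partial\Omega_a}$ are the equations

$$\sigma\delta_\xi - \tau(D\delta)_\xi = g_\xi, \qquad \xi \in \Omega_a, \tag{4.7.3}$$
$$\delta_\xi = h_\xi, \qquad \xi \in \partial\Omega_a. \tag{4.7.4}$$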


Using the invertibility of positive-definite matrices, Appendix F, p.525,
the following proposition is checked along the same pattern of the proof of
Proposition 8, §4.4, p.263.


14 Proposition. If $\sigma \ge 0$, $\tau > 0$, the Dirichlet problem [Eqs. (4.7.3) and
(4.7.4)] always admits one and only one solution for any given boundary and
interior data.
Again, in the same way as in §4.4 and §4.5, one may check the following
proposition.


15 Proposition. Given $g \in C^\infty(R^2)$, $h \in C^\infty(\partial\Omega)$,$^5$ consider the mechanical
system with Lagrangian function [see Eqs. (4.3.13) and (4.3.5)]

$$\mathcal{L} = \sum_{\xi\in\Omega_a}\Big(\frac{\mu}{2}a^2\dot y_\xi^2 + \mu a^2 g(\xi)\,y_\xi - \frac{\sigma}{2}a^2 y_\xi^2\Big) - \frac{\tau}{2}a^2\sum_{\xi\in\Omega_a}\sum_{e}\frac{a}{\varepsilon_a(\xi,e)}\,\nu(e,\xi)\,\frac{(y_\xi - y_{\xi+\varepsilon_a(\xi,e)e})^2}{a^2}, \tag{4.7.5}$$

$$y_\xi = h(\xi), \qquad \forall\xi \in \partial\Omega_a. \tag{4.7.6}$$

This mechanical system has one and only one equilibrium configuration $y = c^{(a)}$.
It is described by the solution $c^{(a)}$ of the $Z_a^2$-discretized Dirichlet problem
with interior data $(\mu g(\xi))_{\xi\in\Omega_a}$ and boundary data $(h(\xi))_{\xi\in\partial\Omega_a}$.


Observation. More generally, if one is not interested in the limit $a \to 0$ the
conditions $g \in C^\infty(R^2)$, $h \in C^\infty(\partial\Omega)$ can be replaced by $g = (g(\xi))_{\xi\in\Omega_a}$ and
$h = (h_\xi)_{\xi\in\partial\Omega_a}$.

Difficulties arise when one wishes to study the $a \to 0$ limit. Basically, one can
say that the difficulties are due to the impossibility of providing the eigenvalues
$\omega_1^2, \omega_2^2, \dots$ and the respective eigenvectors $\eta^{(1)}, \eta^{(2)}, \dots$, describing the normal
modes of the system of Eqs. (4.7.5) and (4.7.6), in a very explicit way, as in
the case $d = 1$. Hence, the theory has to be developed in a somewhat more
abstract way.


An example of a result that should be possible to obtain is as follows. 


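16 Proposition. The stationary solution $c^{(a)}$ of the equations for the mechanical
system of Eqs. (4.7.5) and (4.7.6) with $g \in C^\infty(R^2)$, $h \in C^\infty(\partial\Omega)$
is such that the limit

$$\lim_{a\to0,\ \xi\to x} c^{(a)}_\xi = c(x), \qquad x \in \Omega, \tag{4.7.7}$$

exists and defines a function $c \in C^\infty(\Omega)$ such that

$$\sigma c(x) - \tau\Delta c(x) = g(x), \qquad x \in \Omega, \tag{4.7.8}$$
$$c(x) = h(x), \qquad x \in \partial\Omega, \tag{4.7.9}$$

where $\Delta f(x) = \sum_{i=1}^{2}\frac{\partial^2 f}{\partial x_i^2}(x)$, $\forall f \in C^\infty(\Omega)$. Furthermore, Eqs. (4.7.8) and
(4.7.9) have a unique solution in $C^\infty(\Omega)$.
The motions $t \to y^{(a)}(t)$, $t \in R$, of the above mechanical system, fulfilling the
initial conditions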


5 See footnote 1. 
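$$y^{(a)}_\xi(0) = c^{(a)}_\xi + u_0(\xi), \qquad \xi \in \Omega_a, \tag{4.7.10}$$
$$\dot y^{(a)}_\xi(0) = v_0(\xi), \qquad \xi \in \Omega_a, \tag{4.7.11}$$

with $u_0, v_0 \in C_0^\infty(\Omega)$, are such that the limit

$$\lim_{a\to0,\ \xi\to x} y^{(a)}_\xi(t) = w(x,t), \qquad (x,t) \in \Omega \times R \tag{4.7.12}$$

exists and defines a $C^\infty(\Omega \times R)$ function. Furthermore, setting

$$w(x, t) = c(x) + \bar w(x, t), \tag{4.7.13}$$

it is

$$\sigma\bar w(x, t) - \tau\Delta\bar w(x, t) + \mu\frac{\partial^2\bar w}{\partial t^2}(x, t) = 0, \tag{4.7.14}$$
$$\bar w(x, 0) = u_0(x), \qquad \frac{\partial\bar w}{\partial t}(x, 0) = v_0(x), \tag{4.7.15}$$
$$\bar w(x, t) = 0, \qquad \frac{\partial\bar w}{\partial t}(x, t) = 0, \qquad \forall x \in \partial\Omega,\ \forall t \in R. \tag{4.7.16}$$

Finally, there is a family of functions $S^{(h)} \in C^\infty(\Omega)$, $h = 1, 2, \dots$, vanishing
on $\partial\Omega$ and a sequence $\bar\omega(h)$, $h = 1, 2, \dots$, of positive numbers such that

$$\bar w(x,t) = \sum_{h=1}^{\infty} S^{(h)}(x)\Big[\hat u_0(h)\cos\bar\omega(h)t + \frac{\hat v_0(h)}{\bar\omega(h)}\sin\bar\omega(h)t\Big], \tag{4.7.17}$$

where

$$\hat u_0(h) = \int_\Omega S^{(h)}(x)\,u_0(x)\,dx, \qquad \hat v_0(h) = \int_\Omega S^{(h)}(x)\,v_0(x)\,dx, \tag{4.7.18}$$

and the series Eq. (4.7.17) converges, $\forall x \in \Omega$, $\forall t \in R$.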


Observations. 

(1) The analogy between the vibrating string and the vibrating film would
then be essentially complete. However, this author does not know if there is
a proof of Proposition 16 (admitting its truth) in the above generality.

(2) There is a case in which an obvious variation of the above proposition holds
and its proof is very simple. It is the case in which $\Omega$ is a torus (i.e., $\Omega$ is a
“bicycle tire”) and $\sigma > 0$. Mathematically, this is the system associated with
the Lagrangian that follows; let $N = L/a$ = integer, $\Omega_L = [0, L-a] \times [0, L-a]$:
$$\mathcal{L}_{per} = \mu a^2\sum_{\xi\in\Omega_L\cap Z_a^2}\Big(\frac{1}{2}\dot y_\xi^2 + g(\xi)\,y_\xi\Big) - \frac{\sigma}{2}a^2\sum_{\xi\in\Omega_L\cap Z_a^2} y_\xi^2 - \frac{\tau}{2}a^2\sum_{\xi\in\Omega_L\cap Z_a^2}\sum_{e}\frac{(y_\xi - y_{\xi+ae})^2}{a^2} \tag{4.7.19}$$

and in the last sum the points which do not belong to $\Omega_L \cap Z_a^2$ and which
correspond to the points adjacent to the boundary of $\Omega_L$ have to be identified
with the points on the boundary of $\Omega_L$ opposite to them.

In other words, $\Omega_L \cap Z_a^2$ is thought of as a “discrete torus” and the film
loses its boundary, becoming a “tube”.

The theory of Eq. (4.7.19) is identical to that of the vibrating string.
Actually it is technically even easier (and analogous to Problem 9, §4.5, on
the vibrating string). The role played by the functions $\sqrt{\frac{2}{L}}\sin(\frac{\pi h}{L}ja)$ in the
vibrating-string case is now played by

$$\frac{1}{L}\,e^{i\frac{2\pi}{L}(h_1 j_1 + h_2 j_2)a} \tag{4.7.20}$$

with $(j_1, j_2) \in Z^2$, integers. The $\omega_h^2$ is now replaced by

$$\omega^2_{h_1h_2} = \frac{\sigma}{\mu} + \frac{2\tau}{\mu}\Big[\frac{1 - \cos\frac{2\pi h_1}{L}a}{a^2} + \frac{1 - \cos\frac{2\pi h_2}{L}a}{a^2}\Big] \tag{4.7.21}$$

while the role of Lemma 11, §4.5, is simply played by the two-dimensional
Fourier theorem.

The detailed development of the theory of the motion of Eq. (4.7.19) (and
of the analogous one-dimensional system, Problem 11, §4.5) is a very useful
exercise. The reader will however realize that the assumption $\sigma > 0$ cannot,
in the case of such periodic boundary conditions, be replaced by $\sigma \ge 0$ (what
is the physical meaning of this?)
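A quick numerical look at the pulsations of the discrete torus may help in the exercise just proposed. The sketch below rests on an assumption: it takes $\omega^2_{h_1h_2} = \frac{\sigma}{\mu} + \frac{2\tau}{\mu}\sum_{i=1,2}\frac{1-\cos(2\pi h_i a/L)}{a^2}$ as the dispersion relation of Eq. (4.7.19), and compares it with its formal limit $\frac{\sigma}{\mu} + \frac{\tau}{\mu}(\frac{2\pi}{L})^2(h_1^2 + h_2^2)$ as $a \to 0$; the function names and parameter values are illustrative.

```python
import math

def omega2_torus(h1, h2, a, L=1.0, mu=1.0, sigma=1.0, tau=1.0):
    """Assumed lattice dispersion relation on the discrete torus of spacing a."""
    return sigma / mu + (2.0 * tau / mu) * sum(
        (1.0 - math.cos(2.0 * math.pi * h * a / L)) / a ** 2 for h in (h1, h2)
    )

def omega2_continuum(h1, h2, L=1.0, mu=1.0, sigma=1.0, tau=1.0):
    """Formal a -> 0 limit of omega2_torus at fixed (h1, h2)."""
    return sigma / mu + (tau / mu) * (2.0 * math.pi / L) ** 2 * (h1 ** 2 + h2 ** 2)

if __name__ == "__main__":
    for a in (0.1, 0.01, 0.001):
        print(a, omega2_torus(1, 2, a), omega2_continuum(1, 2))
    # the spatially constant mode h1 = h2 = 0 has omega^2 = sigma / mu
    print(omega2_torus(0, 0, 0.01, sigma=0.0))
```

In this sketch the mode $h_1 = h_2 = 0$ has squared pulsation $\sigma/\mu$, a fact that the reader may want to keep in mind when considering the question raised above.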

To conclude our analysis of the ordered systems of oscillators, we define 
and study concisely the notion of velocity of wave propagation. 




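6 Definition. Let $\Omega$ be an open region with regular boundary $\partial\Omega$, $\Omega \subset R^d$,
$d = 1$ or $d = 2$.
Consider the wave equation in $\Omega$ for $w \in C^\infty(\Omega \times R)$:

$$\sigma w(x,t) - \tau\Delta w(x,t) + \mu\frac{\partial^2 w}{\partial t^2}(x,t) = 0, \qquad x \in \Omega, \tag{4.7.22}$$
$$w(x, t) = 0 = \frac{\partial w}{\partial t}(x, t), \qquad x \in \partial\Omega \tag{4.7.23}$$

with initial data

$$w(x, 0) = u_0(x), \tag{4.7.24}$$
$$\frac{\partial w}{\partial t}(x, 0) = v_0(x) \tag{4.7.25}$$

and suppose $u_0, v_0 \in C_0^\infty(\Omega)$ and vanishing outside a neighborhood with radius
$\varepsilon$ around $x_0 \in \Omega$. Given $x_1 \in \Omega$, $x_1 \ne x_0$, let

$$t_\varepsilon(x_0, x_1) = \inf_{u_0, v_0}\{\text{inf of the set of the values } t \text{ for which there is } t' < t,\ t' > 0, \text{ where } w(x_1, t') \ne 0\}. \tag{4.7.26}$$

Obviously, $t_\varepsilon(x_0, x_1) \ge t_{\varepsilon'}(x_0, x_1)$ if $\varepsilon' > \varepsilon$, and

$$t(x_0, x_1) = \sup_{\varepsilon>0} t_\varepsilon(x_0, x_1) \tag{4.7.27}$$

is the “minimum time” needed for a perturbation of the equilibrium (i.e.,
flat) string, or film, initially located around $x_0$ to “reach” $x_1$.

The “wave velocity” of the waves described by Eqs. (4.7.22) and (4.7.23) is
naturally defined as

$$C = \sup_{x_1 \ne x_0}\frac{|x_1 - x_0|}{t(x_0, x_1)}. \tag{4.7.28}$$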
Observation. In the $d = 2$ case, we did not prove existence and uniqueness
theorems for Eqs. (4.7.22)-(4.7.25), while for $d = 1$ we did. However, if we
set $t_\varepsilon(x_0, x_1) = +\infty$ if for every $(u_0, v_0)$ there is no solution to Eqs. (4.7.22)-(4.7.25)
and if, in case of nonunique solutions, we take into account all the
solutions in the infimum in Eq. (4.7.26), the above definition also makes sense
for $d = 2$.

In any case, this is not a real problem since existence and uniqueness for Eqs.
(4.7.22)-(4.7.25) for $u_0, v_0 \in C_0^\infty(\Omega)$ can be proved in a satisfactory sense.


Let us prove the following proposition for $\sigma = 0$:


17 Proposition. Let $d = 1$, $\Omega = (0, L)$. The wave propagation velocity of the
waves described by Eqs. (4.7.22) and (4.7.23) with $\sigma \ge 0$, $\tau > 0$, $\mu > 0$ is

$$C = \sqrt{\frac{\tau}{\mu}} \tag{4.7.29}$$

independent of the value of $\sigma$.


PROOF. (Case $\sigma = 0$ only). From Eq. (4.5.19), we derive by trigonometry:
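$$\begin{aligned}
w(x, t) &= \sum_{h=1}^{\infty}\sin\frac{\pi h}{L}x\,\Big[\hat u_0(h)\cos\frac{\pi h}{L}Ct + \frac{\hat v_0(h)}{\frac{\pi h}{L}C}\sin\frac{\pi h}{L}Ct\Big] \\
&= \sum_{h=1}^{\infty}\frac{\hat u_0(h)}{2}\Big[\sin\frac{\pi h}{L}(x + Ct) + \sin\frac{\pi h}{L}(x - Ct)\Big] \\
&\quad + \sum_{h=1}^{\infty}\frac{\hat v_0(h)}{2\frac{\pi h}{L}C}\Big[\cos\frac{\pi h}{L}(x - Ct) - \cos\frac{\pi h}{L}(x + Ct)\Big],
\end{aligned} \tag{4.7.30}$$

since $\bar\omega(h)^2 = C^2(\frac{\pi h}{L})^2$. Then, let $\forall x \in R$,

$$u_0^*(x) = \sum_{h=1}^{\infty}\hat u_0(h)\sin\frac{\pi h}{L}x, \qquad v_0^*(x) = \sum_{h=1}^{\infty}\hat v_0(h)\sin\frac{\pi h}{L}x \tag{4.7.31}$$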
and, by Lemma 11, p.266, plus the periodicity and parity properties of
the sine:

(i) $u_0^*(x) = u_0(x)$, $v_0^*(x) = v_0(x)$, $\forall x \in [0, L]$,
(ii) $u_0^*(L + x) = -u_0(L - x)$, $v_0^*(L + x) = -v_0(L - x)$, $\forall x \in [0, L]$,     (4.7.32)
(iii) $u_0^*, v_0^*$ are periodic $C^\infty$ functions with period $2L$, i.e., $u_0^*, v_0^*$ are obtained
from $u_0, v_0$ by first reflecting them about $L$ and then by periodic continuation
of the function on $[0, 2L]$ thus constructed.
If $u_0$ has support in a neighborhood with radius $\varepsilon$ around $x_0$, one finds that
$u_0^*$ is as described in Fig. 4.7.


Figure 4.7: Example of graph of $u_0^*$.


Equation (4.7.30) can be written in terms of $u_0^*, v_0^*$ as

$$w(x,t) = \frac{1}{2}\big[u_0^*(x + Ct) + u_0^*(x - Ct)\big] + \frac{1}{2C}\int_{x-Ct}^{x+Ct} v_0^*(\xi)\,d\xi \tag{4.7.33}$$

[see, also, Problem 11, §4.5, for an alternative proof of Eq. (4.7.33)].
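The representation of the solution through the reflected and periodized data can be tried out directly. The following is a minimal sketch, assuming $\sigma = 0$ and the formula $w(x,t) = \frac12[u_0^*(x+Ct) + u_0^*(x-Ct)] + \frac{1}{2C}\int_{x-Ct}^{x+Ct}v_0^*(\xi)\,d\xi$; the helper names are illustrative and the integral is approximated by the midpoint rule.

```python
import math

def odd_periodic_extension(f, L):
    """Extend f from [0, L] to R: odd reflection about L, then period 2L (cf. (4.7.32))."""
    def f_star(x):
        y = x % (2.0 * L)                      # reduce to [0, 2L)
        return f(y) if y <= L else -f(2.0 * L - y)
    return f_star

def dalembert(u0, v0, C, L, n=2000):
    """Return w(x, t) built from the extended data by the formula quoted above."""
    u_star = odd_periodic_extension(u0, L)
    v_star = odd_periodic_extension(v0, L)
    def w(x, t):
        lo, hi = x - C * t, x + C * t
        dxi = (hi - lo) / n
        integral = sum(v_star(lo + (i + 0.5) * dxi) for i in range(n)) * dxi
        return 0.5 * (u_star(hi) + u_star(lo)) + integral / (2.0 * C)
    return w

if __name__ == "__main__":
    L, C = 1.0, 2.0
    w = dalembert(lambda x: math.sin(math.pi * x), lambda x: 0.0, C, L)
    x, t = 0.3, 0.7
    # for this datum the Fourier series reduces to the single mode sin(pi x) cos(pi C t)
    print(w(x, t), math.sin(math.pi * x) * math.cos(math.pi * C * t))
```

For a single Fourier mode the two printed numbers should coincide, which is the content of the trigonometric identities used in Eq. (4.7.30).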
Then, for instance, one sees from the picture that in order that $u_0^*(x_1 - Ct) \ne 0$
the point $x_1 - Ct$ has to fall inside some of the intervals where $u_0^*$ is not
zero; hence, $t$ must be such that

$$\frac{|x_1 - x_0| + \varepsilon}{C} \ge t \ge \frac{|x_1 - x_0| - \varepsilon}{C} \tag{4.7.34}$$

and a similar bound on $t$ can be found likewise, discussing the third term
in Eq. (4.7.33). This clearly implies Eq. (4.7.29). ∎


4.8 Anharmonic Oscillators. Small Oscillations and 
Integrable Systems 


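Consider an $\ell$-degrees-of-freedom system with Lagrangian function

$$\mathcal{L} = \frac{1}{2}\sum_{i,j=1}^{\ell} g_{ij}(\beta)\,\dot\beta_i\dot\beta_j - V(\beta) \tag{4.8.1}$$

where $g$ is a given $C^\infty(R^\ell)$ positive-definite $\ell \times \ell$ matrix and $V \in C^\infty(R^\ell)$ is a
given potential energy function. Assume that $V$ has a second-order minimum
in $\beta_0 \in R^\ell$; i.e., $\frac{\partial V}{\partial\beta_i}(\beta) = 0$ in $\beta_0$ and that the matrix

$$V_{ij} = \frac{\partial^2 V}{\partial\beta_i\,\partial\beta_j}(\beta_0), \qquad i, j = 1, \dots, \ell, \tag{4.8.2}$$

is positive definite. Then $\beta_0$ is an equilibrium point.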


7 Definition. The “small oscillations” near $\beta_0$ of the system described by
Eq. (4.8.1), with $V$ verifying Eq. (4.8.2), are the motions of the mechanical
system with Lagrangian function

$$\mathcal{L}_{small}(\dot\beta, \beta) = \frac{1}{2}\sum_{i,j=1}^{\ell} g_{ij}(\beta_0)\,\dot\beta_i\dot\beta_j - \frac{1}{2}\sum_{i,j=1}^{\ell} V_{ij}\,(\beta_i - \beta_{0i})(\beta_j - \beta_{0j}) \tag{4.8.3}$$

where $\beta = (\beta_1, \dots, \beta_\ell)$, $\beta_0 = (\beta_{01}, \dots, \beta_{0\ell})$. The normal modes pulsations of
Eq. (4.8.3) are called the “proper pulsations” of Eq. (4.8.1) near $\beta_0$; their
reciprocals multiplied by $2\pi$ are the “proper periods”. The reciprocals of the
periods are the “proper frequencies” of the small oscillations.
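For $\ell = 2$ the proper pulsations can be computed by hand from $\det(V - \omega^2 g(\beta_0)) = 0$, which is a quadratic equation in $\omega^2$. The sketch below is only an illustration: the function name is not from the text, and the sample matrices are an assumed example (they correspond to a planar double pendulum with unit masses, unit lengths and unit gravity, linearized at its stable equilibrium).

```python
import math

def proper_pulsations_2dof(G, V):
    """Roots omega > 0 of det(V - omega^2 G) = 0 for symmetric positive-definite 2x2 G, V."""
    (g11, g12), (_, g22) = G
    (v11, v12), (_, v22) = V
    # det(V - lam*G) = A*lam^2 - B*lam + C, with lam = omega^2
    A = g11 * g22 - g12 * g12
    B = g11 * v22 + g22 * v11 - 2.0 * g12 * v12
    C = v11 * v22 - v12 * v12
    disc = math.sqrt(max(B * B - 4.0 * A * C, 0.0))
    lams = ((B - disc) / (2.0 * A), (B + disc) / (2.0 * A))
    return tuple(math.sqrt(lam) for lam in lams)

if __name__ == "__main__":
    G = ((2.0, 1.0), (1.0, 1.0))   # kinetic matrix g(beta_0) of the assumed example
    V = ((2.0, 0.0), (0.0, 1.0))   # Hessian of the potential at beta_0
    for omega in proper_pulsations_2dof(G, V):
        print("pulsation", omega, "period", 2.0 * math.pi / omega)
```

In the assumed example the characteristic equation is $\lambda^2 - 4\lambda + 2 = 0$, so that $\omega^2 = 2 \pm \sqrt{2}$.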


Observations. 

(1) Therefore, the small oscillations are the motions of the Lagrangian system
obtained by replacing the matrix $g$ with its value at the equilibrium point $\beta_0$
and by replacing the potential energy $V$ by its Taylor expansion about $\beta_0$
truncated to second order:

$$V(\beta) = V(\beta_0) + \frac{1}{2}\sum_{i,j=1}^{\ell} V_{ij}\,(\beta_i - \beta_{0i})(\beta_j - \beta_{0j}) \tag{4.8.4}$$

and in Eq. (4.8.3), $V(\beta_0)$ does not appear since it does not affect the associated
motions.
(2) On the basis of the above definitions, the “small oscillations” are not
necessarily motions with small amplitude. However, one can expect or hope
that if the energy of a motion described by Eq. (4.8.1) is just slightly above
$V(\beta_0)$ (hence, the motion takes place in the vicinity of $\beta_0$ when it is initially
there), then the motion of Eq. (4.8.3) with the same initial data approximates
well the exact motion.

Since the small oscillations are, by definition, harmonic motions, hence 
“simple motions”, one understands the interest in the following question: in 
what sense do the small oscillations approximate the real motions of Eq. 
(4.8.1) near $\beta_0$?

In Chapter 2 we met and essentially solved this problem for systems with
one degree of freedom. The generalization to systems with $\ell > 1$ degrees of
freedom is, however, surprisingly difficult and interesting. In Chapter 5 we
shall discuss some of its aspects. For the moment we shall only provide a 
definition of a class of systems behaving “as if they were linear oscillators” 
and we shall continue by discussing a few remarkable examples of such systems 
warning the reader, however, that it should not be hoped that Definition 10 
to follow is a definition covering many cases. 


8 Definition. Consider a system of $N$ point masses in $R^d$ subject to ideal
bilateral constraints with $\ell$ degrees of freedom and to a conservative force.
We assume that the equations of motion are normal in the future as well as in
the past; i.e., they admit a global solution $t \to S_t(\dot x, x)$ for every initial datum
$(\dot x, x)$ compatible with the constraints. We shall call “space of the initial data”
the set $S \subset R^{2Nd}$ of all the pairs $(\dot x, x)$, where $x$ is a constraint-compatible
configuration and $\dot x$ is a constraint-compatible velocity.

We define on $S$ the “time evolution flow”, $(S_t)_{t\in R}$, as the group of transformations
mapping $(\dot x, x)$ into $S_t(\dot x, x)$ = (datum into which $(\dot x, x)$ evolves in
the time $t$ according to the equations of motion).


Observations 
(1) This generalizes the initial data space, introduced in §2.22, to constrained
systems.
(2) $S$ will be considered to be a surface in $R^{2Nd}$. The geometric structure of
$S$ is very simple as expressed by the following proposition.


18 Proposition. The surface $S$ of the preceding definition is a regular surface
in $R^{2Nd}$.

PROOF. By Definition 10, §3.6, p.170, given $(\dot x_0, x_0) \in S$, we have to find a
neighborhood $W$ of $(\dot x_0, x_0)$ on which it is possible to establish a local system
of regular coordinates adapted to the surface $S$.

Let $U$ be a neighborhood of $x_0$ on which it is possible to establish a local
system of regular coordinates $x = \Xi(\beta)$, with basis $\Omega$, adapted to the surface
$\Sigma$ in $R^{Nd}$ defined by the constraint. The set $U$ exists by the very definition
of an $\ell$-degrees-of-freedom holonomous constraint.
In this coordinate system, the possible velocity vectors $\dot\beta$ for the system
which are compatible with the constraints are those such that

$$\dot\beta_1 = \dot\beta_2 = \dots = \dot\beta_{Nd-\ell} = 0, \tag{4.8.5}$$

while the possible position vectors $\beta$ are those for which

$$\beta_1 = \beta_2 = \dots = \beta_{Nd-\ell} = 0, \qquad (0, \dots, 0, \beta_{Nd-\ell+1}, \dots, \beta_{Nd}) \in \Omega. \tag{4.8.6}$$

Hence, the correspondence between $R^{Nd} \times \Omega$ and $R^{2Nd}$ described by

$$\dot x^{(i)} = \sum_{j=1}^{Nd}\dot\beta_j\,\frac{\partial\Xi^{(i)}}{\partial\beta_j}(\beta), \qquad x^{(i)} = \Xi^{(i)}(\beta) \tag{4.8.7}$$

establishes on the image $W \subset R^{2Nd}$ of $R^{Nd} \times \Omega$ a coordinate system near
$(\dot x_0, x_0) \in W$ adapted to $S$ with basis $R^{Nd} \times \Omega$, and it is easily checked that the
Jacobian determinant of this coordinate change at the point with coordinates
$(\dot\beta, \beta)$ is the square of the Jacobian determinant in $\beta$ of the transformation
$\Xi$. By the regularity assumption on the coordinate system $(U, \Xi)$, such a
determinant does not vanish. ∎


Observations.
(1) The above proof shows that setting

    $\dot\kappa = (\dot\beta_{Nd-\ell+1},\ldots,\dot\beta_{Nd}) \stackrel{def}{=} (\dot\kappa_1,\ldots,\dot\kappa_\ell)$,
    $\kappa = (\beta_{Nd-\ell+1},\ldots,\beta_{Nd}) \stackrel{def}{=} (\kappa_1,\ldots,\kappa_\ell)$,    (4.8.8)

Eqs. (4.8.7) establish a coordinate system, $(\dot\kappa,\kappa)$, for the points of $W \cap S$,
where W is the image via Eqs. (4.8.7) of $R^{Nd} \times \Omega$. Furthermore, as $(\mathbf{x},\dot{\mathbf{x}})$
varies in $W \cap S$, the point $(\dot\kappa,\kappa)$ varies in $R^\ell \times V$ where V is an open convex
set in $R^\ell$ (as $V = \Omega \cap \{\text{plane } \beta_1 = \ldots = \beta_{Nd-\ell} = 0\}$, and $\Omega$ is convex).
One refers to this remark by saying that the data space S of a system with $\ell$
degrees of freedom locally has the structure $R^\ell \times V$ with $V \subset R^\ell$.
For this reason, and with an abuse of notation very useful and widely used,
one often denotes the points of S as $(\dot\kappa,\kappa)$, where $(\dot\kappa,\kappa)$ are local regular
coordinates (which have to be deduced from the context and which often are
really local (i.e., non global) coordinates), in a neighborhood W of a point in
S such that $W \cap S$ has the structure $R^\ell \times V$.
Consistently, the Lagrangian of the constrained system is described as a function
$L(\dot\kappa,\kappa)$ of $(\dot\kappa,\kappa)$.
(2) Since S is a regular surface, it makes sense to define the open sets on S
and the space $C^\infty(S)$. A set $E \subset S$ is open on S if it is the intersection of
an open set in $R^{2Nd}$ with S. A function f is in $C^\infty(S)$ if its restriction to a
neighborhood U, on which it is possible to set up a local system of regular
coordinates transforming U into $R^\ell \times V$, has the property that, if thought of
as a function of the local coordinates $(\dot\kappa,\kappa)$, it is a $C^\infty(R^\ell \times V)$ function.


9 Definition. Let S be the initial data space for a system of N point masses
with $\ell$ degrees of freedom subject to ideal holonomous constraints and to conservative
forces. Let $A \in C^\infty(S)$ be a real-valued function on S. We shall say
that A is a “prime integral” (or “first integral” or “constant of motion”) for
the motions $t \to S_t(\mathbf{x},\dot{\mathbf{x}}) = (\mathbf{x}(t),\dot{\mathbf{x}}(t))$, $t \in R$, if

    $A(\mathbf{x}(t),\dot{\mathbf{x}}(t)) = \text{constant}$    (4.8.9)

for all $(\mathbf{x},\dot{\mathbf{x}}) \in S$.

Examples
(1) The energy

    $E(\mathbf{x},\dot{\mathbf{x}}) = \frac{1}{2}\sum_{i=1}^{N} m_i (\dot{\mathbf{x}}^{(i)})^2 + V(\mathbf{x}^{(1)},\ldots,\mathbf{x}^{(N)})$    (4.8.10)

is a typical example of a prime integral. Often it is the only prime integral admitted by the
system’s motions.
(2) If the system is isolated, i.e., subject to zero external forces, the d components of the
linear momentum

    $Q(\mathbf{x},\dot{\mathbf{x}}) = \sum_{i=1}^{N} m_i \dot{\mathbf{x}}^{(i)}$    (4.8.11)

are also prime integrals when the third law of dynamics holds. In the same situation, the
angular momentum components also give rise to prime integrals.


19 Proposition. A system of $\ell$ harmonic oscillators with Lagrangian function
(4.1.2) admits $\ell$ prime integrals $A_1,\ldots,A_\ell$ given by Eq. (4.1.5). Furthermore,
it is possible to parameterize the initial data space S through the values
$A_1,\ldots,A_\ell$ and a point $\varphi \in T^\ell$, $\varphi = (\varphi_1,\ldots,\varphi_\ell)$, so that S can be thought of
as the product $[0,+\infty)^\ell \times T^\ell$, and the motion $t \to (\mathbf{x}(t),\dot{\mathbf{x}}(t))$, $t \in R_+$, of the
system is described, in these coordinates, as

    $(A_1,\ldots,A_\ell,\varphi_1,\ldots,\varphi_\ell) \to (A_1,\ldots,A_\ell,\varphi_1+\omega_1 t,\ldots,\varphi_\ell+\omega_\ell t)$,    (4.8.12)

where $\omega_1,\ldots,\omega_\ell$ are positive constants.
Finally, the correspondence $(A,\varphi) \to (\mathbf{x},\dot{\mathbf{x}})$ is a $C^\infty$ invertible nonsingular
correspondence between $(0,+\infty)^\ell \times T^\ell$ and the subset of $R^{2\ell}$ which is its image.


Proposition 19 suggests the following definition. 


6 See definition 13 and related observations, p.101, for the meaning of the derivatives.

10 Definition. Let S be the initial data space for a system with $\ell$ degrees of
freedom subject to ideal constraints and to conservative active forces.

We shall say that the system is “integrable” on the open region $W \subset S$ if on
W it is possible to define $\ell$ prime integrals $A = (A_1,\ldots,A_\ell)$ and $\ell$ $T^1$-valued
$C^\infty(W)$ functions $\varphi = (\varphi_1,\ldots,\varphi_\ell)$ such that:

(1) The image of W under the map

    $(\mathbf{x},\dot{\mathbf{x}}) \to I(\mathbf{x},\dot{\mathbf{x}}) = (A,\varphi)$    (4.8.13)

has the form $V \times T^\ell$, where V is an open set in $R^\ell$ and the correspondence I
between W and $V \times T^\ell$ is an invertible nonsingular (i.e., with non vanishing
Jacobian determinant) correspondence.

(2) There are $\ell$ real $C^\infty$ functions on V, $A \to \omega(A) = (\omega_1(A),\ldots,\omega_\ell(A)) \in R^\ell$,
such that if $t \to \mathbf{x}(t)$ is a motion with initial data $(\mathbf{x}(0),\dot{\mathbf{x}}(0)) \in W$, then,
$\forall t \in R_+$, $(\mathbf{x}(t),\dot{\mathbf{x}}(t)) \in W$ and

    $I(\mathbf{x}(t),\dot{\mathbf{x}}(t)) = (A_0,\varphi_0 + \omega(A_0)t)$    (4.8.14)

where $A_0 = A(\mathbf{x}(0),\dot{\mathbf{x}}(0))$, $\varphi_0 = \varphi(\mathbf{x}(0),\dot{\mathbf{x}}(0))$, and $\varphi \to \varphi + \omega(A_0)t$ denotes
the quasi-periodic flow on $T^\ell$ with speed $\omega(A_0)$, see Definition 1, p.248. The
numbers $\omega_i(A)$, $T_i(A) = \frac{2\pi}{\omega_i(A)}$, $\nu_i(A) = \frac{1}{T_i(A)}$, $i = 1,\ldots,\ell$, are, respectively,
called the pulsations, the periods, and the frequencies of the motions in W
with amplitudes A.
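
The content of Eq. (4.8.14) can be illustrated by a minimal Python sketch, which is not part of the original text. It assumes $\ell = 2$ and a purely hypothetical choice of the pulsations $\omega(A)$ (the function `example_omega` below); the angles are represented by numbers in $[0, 2\pi)$.

```python
import math

TWO_PI = 2.0 * math.pi

def torus_flow(A, phi, omega, t):
    """Image of the point (A, phi) of V x T^l after a time t, as in Eq. (4.8.14).

    omega is a function A -> (omega_1(A), ..., omega_l(A)).
    """
    w = omega(A)
    new_phi = tuple((p + wi * t) % TWO_PI for p, wi in zip(phi, w))
    return tuple(A), new_phi  # the prime integrals A do not change

def periods(A, omega):
    """Periods T_i(A) = 2*pi / omega_i(A) of Definition 10."""
    return tuple(TWO_PI / wi for wi in omega(A))

def example_omega(A):
    # Hypothetical pulsations, chosen only to have an A-dependent example.
    return (1.0 + A[0], 2.0 + 0.5 * A[1])

A0, phi0 = (0.5, 1.0), (0.3, 1.0)
T1, T2 = periods(A0, example_omega)
A1, phi1 = torus_flow(A0, phi0, example_omega, T1)
print(A1, phi1)
```

After the time $T_1(A_0)$ the first angle is back at its initial value while, in general, the second is not: the motion on the torus is periodic only if the periods have rational ratio.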


Observations. (1) In the case of a system of harmonic oscillators, there are
various choices of W for which the system is integrable on W: the most natural
one takes W to be the set in S whose image under the map of Eq. (4.1.15)
is $(0,+\infty)^\ell \times T^\ell$ (i.e., the set of data having all the normal modes excited:
$A_i > 0$ for $i = 1,\ldots,\ell$).

(2) One can interpret Eqs. (4.8.13) and (4.8.14) as saying that the data
space W of an integrable system is “foliated by an $\ell$-parameter family of
$\ell$-dimensional invariant tori”. The parameters are the values of the $\ell$ prime
integrals. The torus with parameters $A \in V$ is the set $I^{-1}(\{A\} \times T^\ell)$, image of
$\{A\} \times T^\ell$ under the inverse of the “integration map” I.

(3) In the case of harmonic oscillators, $\omega(A)$ is A-independent: “isochrony of
the harmonic oscillations”. As seen in the case $\ell = 1$, §2.10, it is obvious that
this should be a very special property of the harmonic oscillators. Therefore,
it is better not to introduce it into the definition of integrable system, to avoid
giving too restrictive a definition.

(4) In the context of the theory of small oscillations, the above definition
seems especially designed to formulate the conjecture that in a small enough
neighborhood W of an equilibrium position $(0,\beta_0)$ for a mechanical system
described by a Lagrangian (4.8.1) verifying Eq. (4.8.2), the system is integrable.

Such a conjecture, true if $\ell = 1$, is generally false if $\ell > 1$; i.e., there may be
motions which stay indefinitely close to an equilibrium point and, nevertheless,
move in a fashion substantially different from a quasi-periodic motion.
However, a conjecture similar to this one is true. We shall discuss this matter
in Chapter 5, §5.9-§5.12.

(5) To establish the integrability of a system with $\ell$ degrees of freedom, one
usually proceeds to show that it is possible to describe the motions which
develop in W in terms of $2\ell$ parameters $(A,\varphi) \in V \times T^\ell$ and of N $C^\infty$-functions
on $V \times T^\ell$, $\Phi^{(1)},\ldots,\Phi^{(N)}$, such that if $t \to \mathbf{x}(t)$ is a motion, one
has, $\forall i = 1,2,\ldots,N$,

    $\mathbf{x}^{(i)}(t) = \Phi^{(i)}(A_1,\ldots,A_\ell,\varphi_1+\omega_1 t,\ldots,\varphi_\ell+\omega_\ell t)$,    (4.8.15)

where $\omega_1(A),\ldots,\omega_\ell(A)$ are $\ell$ $C^\infty$-functions of $A \in V$.

Subsequently, one proceeds to check the invertibility, regularity, and nonsingularity
of the map $(\mathbf{x}(0),\dot{\mathbf{x}}(0)) \to (A,\varphi)$. This check is usually an easy matter
and without direct interest once Eq. (4.8.15) has been established for all the
motions in W. Actually, the true analytic difficulty that is met in the integrability
proofs lies in the proof of the validity of a consequence of Eq. (4.8.15):
precisely, in checking that all the motions in W are “quasi periodic” in the
sense that their coordinates depend quasi-periodically on time (see §2.21 for
the notion of quasi-periodicity). See, however, Problem 20 to §4.15.


Therefore, in the upcoming sections, we shall often stop our analysis of 
integrability when we find that the motions taking place in a given W are 
quasi-periodic, without entering into the sometimes long analysis necessary 
to prove the invertibility and smoothness properties required by integrability. 

The following extension of Definition 10 is natural in the context of the 
concepts of analytical mechanics of §3.11 and §3.12. 


11 Definition. Let $L \in C^\infty(W)$, $W \subset R^{2\ell}$ or $W \subset R^\ell \times T^\ell$ or
$W \subset R^\ell \times (R^{\ell_1} \times T^{\ell_2})$, $\ell_1 + \ell_2 = \ell$, be a time-independent regular Lagrangian on
W (see Definition 14, §3.11, p.211).
We say that L is integrable on the data space W if there is an integrating
map I transforming W into $V \times T^\ell$ enjoying the properties (1) and (2) of
Definition 10, where the motion $t \to \mathbf{x}(t)$ is now a solution to the Lagrangian
equations relative to L.
Similarly, if $H \in C^\infty(\tilde W)$, $\tilde W \subset R^{2\ell}$ or $\tilde W \subset R^\ell \times T^\ell$ or
$\tilde W \subset R^\ell \times (R^{\ell_1} \times T^{\ell_2})$, $\ell_1 + \ell_2 = \ell$, is a regular time-independent Hamiltonian function on
the phase space $\tilde W$, we say that H is integrable on $\tilde W$ if the corresponding
Lagrangian function L is integrable on the data space subset $W = \Xi^{-1}(\tilde W)$,
$\Xi$ being the map inducing the Legendre transformation between H and L (see
§3.11).

In this case, if I is the integrating map for L, the map

    $\tilde I(p,q) = I(\Xi^{-1}(p,q))$    (4.8.16)

maps $\tilde W$ onto $V \times T^\ell$ and it is called an “integrating map” for H.

If $\tilde I$ is a completely canonical map of $\tilde W$ onto $V \times T^\ell$ we say that H is “canonically
integrable” on the phase space $\tilde W$.

If H is analytic(7) on $\tilde W$ and $\tilde I$ is also analytic, we say that H is “analytically
integrable on $\tilde W$”. If H is analytic and if $\tilde I$ is an analytic completely canonical
map of $\tilde W$ onto $V \times T^\ell$, we shall say that H is “canonically analytically
integrable” on $\tilde W$.

7 Analytic means “having convergent Taylor series” near every point of the domain of
definition, see Definitions 13, 14 and 15, §4.13, p.336.


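Observations.
(1) If $H \in C^\infty(\tilde W)$ is canonically integrable and if $\tilde I$ is a completely canonical
map integrating H, then

    $H(\tilde I^{-1}(A,\varphi)) = h(A)$ = ($\varphi$-independent function)    (4.8.17)

and $\omega(A) = (\frac{\partial h}{\partial A_1}(A),\ldots,\frac{\partial h}{\partial A_\ell}(A))$:

    $\omega(A) = \frac{\partial h}{\partial A}(A)$.    (4.8.18)

In this case the variables $(A,\varphi)$ are called “action-angle” variables and are
canonical variables.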

(2) It turns out that all the systems that we shall consider in the upcoming 
sections are analytically canonically integrable on vast regions of phase space. 
However, this will not always be explicitly checked and it will be left to the 
reader, in the problems, to draw this conclusion from the properties discussed 
in the text. 

(3) In an obvious way, one could also define the notion of a Lagrangian an- 
alytically integrable on some set W in the data space. The corresponding 
Hamiltonian system would then be analytically integrable on the correspond- 
ing phase-space subset W and vice versa. 


4.8.1 Problems 


1. Given $\omega \in R^\ell$ and $g \in C^\infty(T^\ell)$, suppose that $\forall \nu \in Z^\ell$, $\nu \ne 0$, it is $|\omega\cdot\nu|^{-1} < C|\nu|^\alpha$, for
some $C > 0$, $\alpha > 0$. Show that the system on $R^\ell \times T^\ell$ with Hamiltonian $H(A,\varphi) = A\cdot\omega + g(\varphi)$
is integrable and find an expression for $\ell$ prime integrals. Show that this is an isochronous
system. Note that the equations of motion can be solved explicitly for general $\omega$. (Hint:
Write the equations of motion and solve the one for the A’s by developing g into a Fourier
series $g(\varphi) = \sum_{\nu \in Z^\ell} g_\nu e^{i\nu\cdot\varphi}$ before integration. The prime integrals can be chosen

    $B = A + \sum_{\nu \in Z^\ell,\ \nu \ne 0} \nu\, g_\nu \frac{e^{i\nu\cdot\varphi}}{\omega\cdot\nu}$

and the condition on $\omega$ is required to insure the convergence of the series.)


2. In the context of Problem 1, show that if g is a trigonometric polynomial (i.e., it has 
finitely many non vanishing Fourier coefficients), then the results of Problem 1 hold under 
the sole assumption that the components of w are rationally independent. 


3. In the context of Problem 1, suppose that there is $\nu_0 \in Z^\ell$, $\nu_0 \ne 0$, such that $\omega\cdot\nu_0 = 0$.
Show that the Hamiltonian system with Hamiltonian $H(A,\varphi) = A\cdot\omega + \varepsilon\cos(\nu_0\cdot\varphi)$ is not
integrable. (Hint: Show that its motions are not quasi-periodic.)


4. In the context of Problem 1, suppose that $|\omega\cdot\nu|^{-1} < C|\nu|^\alpha$, $\forall \nu \ne 0$ and $\nu \notin N_0$, where
$N_0$ is a subset of $Z^\ell$. Suppose also that $g \in C^\infty(T^\ell)$ is such that $g_\nu = 0$, $\forall \nu \in N_0$. Show
that the Hamiltonian $H(A,\varphi) = A\cdot\omega + g(\varphi)$ is integrable on $R^\ell \times T^\ell$.

5. Show that the integrability in Problems 1, 2, and 4 is analytical and canonical (Hint:
$A'\cdot\varphi + i\sum_{\nu \ne 0} g_\nu \frac{e^{i\nu\cdot\varphi}}{\omega\cdot\nu} \equiv \Phi(A',\varphi)$ is a generating function for the integrating map.)
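
The hint to Problem 1 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: $\ell = 2$, an arbitrary choice of $\omega$, and a real trigonometric polynomial $g(\varphi) = \sum c_\nu \cos(\nu\cdot\varphi)$ (the lists `OMEGA` and `TERMS` are hypothetical data, not taken from the text). For such a g the prime integrals of the hint become $B = A + \sum \nu\, c_\nu \cos(\nu\cdot\varphi)/(\omega\cdot\nu)$; the equations of motion are integrated by a Runge-Kutta scheme and B is compared at the initial and final times.

```python
import math

OMEGA = (1.0, math.sqrt(2.0))  # assumed pulsations, rationally independent
TERMS = [((1, 0), 0.3), ((1, -1), 0.2), ((2, 1), 0.1)]  # pairs (nu, c_nu)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rhs(s):
    """Hamilton equations for H = A.omega + g(phi); s = [A1, A2, phi1, phi2]."""
    phi = s[2:]
    dA = [0.0, 0.0]
    for nu, c in TERMS:
        k = c * math.sin(dot(nu, phi))  # dA/dt = -dg/dphi
        dA[0] += nu[0] * k
        dA[1] += nu[1] * k
    return dA + list(OMEGA)  # dphi/dt = omega

def evolve(s, t, steps=2000):
    """Classical fourth-order Runge-Kutta integration up to time t."""
    h = t / steps
    s = list(s)
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs([x + 0.5 * h * k for x, k in zip(s, k1)])
        k3 = rhs([x + 0.5 * h * k for x, k in zip(s, k2)])
        k4 = rhs([x + h * k for x, k in zip(s, k3)])
        s = [x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for x, a, b, c, d in zip(s, k1, k2, k3, k4)]
    return s

def prime_integrals(s):
    """B = A + sum_nu nu c_nu cos(nu.phi) / (omega.nu)."""
    A, phi = s[:2], s[2:]
    B = list(A)
    for nu, c in TERMS:
        k = c * math.cos(dot(nu, phi)) / dot(OMEGA, nu)
        B[0] += nu[0] * k
        B[1] += nu[1] * k
    return B

start = [0.5, -0.2, 0.1, 0.7]
end = evolve(start, 10.0)
print(prime_integrals(start), prime_integrals(end))
```

The two printed pairs should agree up to the integration error, while the A's themselves vary along the motion.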


4.9 Integrable Systems. Central Motions with Non 
vanishing Areal Velocity. The Two-Body Problem 


The best-known integrable mechanical system consists, perhaps, of two point
masses with masses $m_1, m_2 > 0$ interacting through a conservative force with
potential energy V depending only on the distance between the two points:

    $V(\xi_1,\xi_2) = V(|\xi_1 - \xi_2|)$;    (4.9.1)

and we shall assume that the function $\rho \to V(\rho)$ is defined for $\rho > 0$ and that
it is a $C^\infty$-function such that

    $\lim_{\rho \to 0} \rho^2 V(\rho) = 0, \qquad \inf_{\rho > \varepsilon} V(\rho) = -V_\varepsilon, \quad \forall \varepsilon > 0$.    (4.9.2)

Note that V(0) is undefined, and this means that we shall only consider motions
$t \to (\mathbf{x}_1(t),\mathbf{x}_2(t))$, $t \in R_+$, such that $|\mathbf{x}_1(t) - \mathbf{x}_2(t)| > 0$, $t \in R_+$. This
restriction will be imposed via the condition of non vanishing areal velocity.

If $t \to (\mathbf{x}_1(t),\mathbf{x}_2(t))$, $t \in R_+$, is a motion of the system, the two points will
move so that the total linear and angular momentum will be conserved. In
fact, the force generated by Eq. (4.9.1) is easily seen to verify the third law of
dynamics, so that the cardinal equations hold and imply the above mentioned
conservation laws.

Hence, the center of mass G moves in a uniform rectilinear fashion and,
possibly by changing reference system, it may be supposed that G coincides
with the origin O of the reference system (O; i, j, k) in which motion is studied.
In this situation, the motion $t \to (\mathbf{x}_1(t),\mathbf{x}_2(t))$, $t \in R_+$, will be such that

    $m_1\mathbf{x}_1(t) = -m_2\mathbf{x}_2(t), \qquad \forall t \in R_+$    (4.9.3)

and to determine the positions of the two points it will suffice to give the
vector $\boldsymbol{\rho} \stackrel{def}{=} \mathbf{x}_2(t) - \mathbf{x}_1(t)$:

    $\mathbf{x}_1(t) = -\frac{m_2}{m_1+m_2}\boldsymbol{\rho}(t), \qquad \mathbf{x}_2(t) = \frac{m_1}{m_1+m_2}\boldsymbol{\rho}(t)$.    (4.9.4)


Since the angular momentum with respect to O is a constant vector K, it can
be assumed, without loss of generality, that K is parallel to k:

    $\mathbf{K} = A\mathbf{k}$.    (4.9.5)

Only motions for which A > 0 will be considered and it will be seen that A is
proportional to the areal velocity. From Eqs. (4.9.3)-(4.9.5), it follows that

    $\mathbf{K} = A\mathbf{k} = m_1\mathbf{x}_1 \wedge \dot{\mathbf{x}}_1 + m_2\mathbf{x}_2 \wedge \dot{\mathbf{x}}_2 = m_1\mathbf{x}_1 \wedge (\dot{\mathbf{x}}_1 - \dot{\mathbf{x}}_2) = \frac{m_1 m_2}{m_1+m_2}\,\boldsymbol{\rho} \wedge \dot{\boldsymbol{\rho}}$,    (4.9.6)

i.e., $\boldsymbol{\rho}$ and $\dot{\boldsymbol{\rho}}$ must both lie in the plane (i, j). Therefore, the motion $t \to \boldsymbol{\rho}(t)$,
$t \in R_+$, takes place on the plane (i, j). Recalling the considerations in
§3.4 about the constraints, we can find the equations of motion by parameterizing
the motion by the polar coordinates $(\rho,\theta)$ of $\boldsymbol{\rho}$ in the plane (i, j) and
then writing the Lagrangian equations for the Lagrangian

    $L = \frac{1}{2}m_1\dot{\mathbf{x}}_1^2 + \frac{1}{2}m_2\dot{\mathbf{x}}_2^2 - V(|\mathbf{x}_1 - \mathbf{x}_2|)$    (4.9.7)

computed on the motions, parameterized as above.
For such motions, Eq. (4.9.7) becomes

    $L(\dot\rho,\dot\theta,\rho,\theta) = \frac{m_1}{2}\Big(\frac{m_2}{m_1+m_2}\Big)^2\dot{\boldsymbol{\rho}}^2 + \frac{m_2}{2}\Big(\frac{m_1}{m_1+m_2}\Big)^2\dot{\boldsymbol{\rho}}^2 - V(\rho)$
    $\qquad = \frac{1}{2}\frac{m_1 m_2}{m_1+m_2}\dot{\boldsymbol{\rho}}^2 - V(\rho) = \frac{1}{2}\frac{m_1 m_2}{m_1+m_2}(\dot\rho^2 + \rho^2\dot\theta^2) - V(\rho)$,    (4.9.8)

where the well-known formula expressing the square velocity $\dot{\boldsymbol{\rho}}^2$ as $\dot\rho^2 + \rho^2\dot\theta^2$
in polar coordinates has been used together with Eq. (4.9.4). Equation (4.9.8)
yields the following proposition:


20 Proposition. The theory of the motion of two point masses, with masses
$m_1, m_2 > 0$, under the action of a mutual central conservative force with
potential energy given by Eq. (4.9.1) is equivalent to the theory of the motion
of a single point mass with mass m:

    $m = \frac{m_1 m_2}{m_1 + m_2}$    (4.9.9)

moving on a plane under the action of a conservative force, centrally acting
on the mass from a point O in the plane, with the same potential energy V.
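
The reduction of Eqs. (4.9.4) and (4.9.9) can be made concrete by a short Python sketch; it is an illustration added here, with hypothetical function names and arbitrary numerical values for the masses and for the relative position.

```python
def reduced_mass(m1, m2):
    """Mass m of Eq. (4.9.9)."""
    return m1 * m2 / (m1 + m2)

def positions_from_relative(m1, m2, rho):
    """Positions x1, x2 of Eq. (4.9.4), center of mass at the origin."""
    total = m1 + m2
    x1 = tuple(-m2 / total * r for r in rho)
    x2 = tuple(m1 / total * r for r in rho)
    return x1, x2

m1, m2 = 1.0, 3.0
rho = (4.0, 0.0, -8.0)
x1, x2 = positions_from_relative(m1, m2, rho)
print(reduced_mass(m1, m2), x1, x2)
```

By construction $m_1\mathbf{x}_1 + m_2\mathbf{x}_2 = 0$ and $\mathbf{x}_2 - \mathbf{x}_1 = \boldsymbol{\rho}$, in agreement with Eq. (4.9.3) and with the definition of $\boldsymbol{\rho}$.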


The motions described by the Lagrangian function (4.9.8) and such that
$A \ne 0$ are called “central motions”.


21 Proposition. The motions of the mechanical system described by Eq.
(4.9.8) admit two prime integrals:

    $E = \frac{1}{2}m(\dot\rho^2 + \rho^2\dot\theta^2) + V(\rho)$,    (4.9.10)

    $A = \rho^2\dot\theta$,    (4.9.11)

and, if $A \ne 0$, they indefinitely stay away from the origin at a distance greater
than some time-independent positive quantity (depending on A and E).

PROOF. Equation (4.9.10) is the total energy and, by Eq. (4.9.6), $m\rho^2\dot\theta$ is
the angular momentum along the z axis, to which the quantity A of Eq. (4.9.11)
is proportional. Hence, E and A are both prime integrals. Note that $\rho^2\dot\theta$ is
twice the “areal velocity”, i.e., twice the area spanned by $\boldsymbol{\rho}$ per unit time.

By substituting Eq. (4.9.11) into Eq. (4.9.10) it follows that

    $E = \frac{1}{2}m\Big(\dot\rho^2 + \frac{A^2}{\rho^2}\Big) + V(\rho)$    (4.9.12)

and Eq. (4.9.2) implies the existence of $\rho_0 > 0$ such that $E - \frac{mA^2}{2\rho^2} - V(\rho) < 0$
for $\rho < \rho_0$. So $\rho(t) \ge \rho_0$, $\forall t \in R_+$. □
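
The two prime integrals can be observed numerically. The Python sketch below is an illustration under assumptions that are not in the text: the mass is m = 1, the potential $V(\rho) = -1/\rho + 0.1\rho^2$ is an arbitrary choice verifying Eq. (4.9.2), and the planar equations of motion are written in Cartesian coordinates and integrated by a Runge-Kutta scheme.

```python
import math

M = 1.0  # mass m of Eq. (4.9.9); an arbitrary illustrative value

def V(rho):
    # Illustrative potential verifying Eq. (4.9.2); not taken from the text.
    return -1.0 / rho + 0.1 * rho * rho

def dV(rho):
    return 1.0 / (rho * rho) + 0.2 * rho

def rhs(s):
    """s = [x, y, vx, vy]; equations m a = -V'(rho) (x, y) / rho."""
    x, y, vx, vy = s
    rho = math.hypot(x, y)
    f = -dV(rho) / (M * rho)
    return [vx, vy, f * x, f * y]

def evolve(s, t, steps):
    """Classical fourth-order Runge-Kutta integration up to time t."""
    h = t / steps
    s = list(s)
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs([u + 0.5 * h * k for u, k in zip(s, k1)])
        k3 = rhs([u + 0.5 * h * k for u, k in zip(s, k2)])
        k4 = rhs([u + h * k for u, k in zip(s, k3)])
        s = [u + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for u, a, b, c, d in zip(s, k1, k2, k3, k4)]
    return s

def energy(s):
    x, y, vx, vy = s
    return 0.5 * M * (vx * vx + vy * vy) + V(math.hypot(x, y))  # Eq. (4.9.10)

def areal_constant(s):
    x, y, vx, vy = s
    return x * vy - y * vx  # equals rho^2 * dtheta/dt, Eq. (4.9.11)

start = [1.0, 0.0, 0.2, 0.9]
end = evolve(start, 5.0, 5000)
print(energy(start), energy(end), areal_constant(start), areal_constant(end))
```

Up to the integration error, the energy and the quantity $\rho^2\dot\theta$ take the same values at the two times.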


Let $t \to (\rho(t),\theta(t))$, $t \in R_+$, be a motion associated with Eq. (4.9.8) with
A > 0. Write the equation of motion for $\rho$ by considering the Lagrangian
equation relative to Eq. (4.9.8) and corresponding to the coordinate $\rho$:

    $m\ddot\rho = m\rho\dot\theta^2 - \frac{\partial V}{\partial\rho}(\rho)$.    (4.9.13)

By Eq. (4.9.11) the latter relation becomes

    $m\ddot\rho = \frac{mA^2}{\rho^3} - \frac{\partial V}{\partial\rho}(\rho) = -\frac{\partial V_A}{\partial\rho}(\rho)$,    (4.9.14)

where

    $V_A(\rho) = \frac{mA^2}{2\rho^2} + V(\rho)$,    (4.9.15)

showing that the $\rho$ coordinate evolves in time as the abscissa of a mass m on
a line, subject to a conservative force with potential energy $V_A$.

Since the motion, by Proposition 21, is such that $\rho(t) \ge \rho_0 > 0$, we can
ignore the singularities of V and $V_A$ in $\rho = 0$ and we can also ignore the
constraint $\rho > 0$ due to $\rho$ being the polar radial coordinate, so that the theory
of Chapter 2 for conservative $C^\infty$ forces acting upon one-dimensional systems
can be applied.


22 Proposition. Let $\rho \to V(\rho)$, $\rho > 0$, be a $C^\infty((0,+\infty))$ function verifying
Eq. (4.9.2). Let W be the open set, in the data space of the system described by
Eq. (4.9.8), consisting of the data with E and A in Eqs. (4.9.10) and (4.9.11)
such that

    (i) $A > 0$.    (4.9.16)

    (ii) The equation $V_A(\rho) = \frac{mA^2}{2\rho^2} + V(\rho) = E$    (4.9.17)

admits just two solutions $\rho_-(E,A)$, $\rho_+(E,A)$ such that $\rho_+(E,A) > \rho_-(E,A)$
and $V_A'(\rho_\pm(E,A)) \ne 0$.

Then the system is integrable in a neighborhood of every point in W and has
two periods, see Definition 10, §4.8, p.287, given by

    $T_1(E,A) = 2\int_{\rho_-}^{\rho_+} \frac{d\rho}{\sqrt{\frac{2}{m}(E - V_A(\rho))}}$,    (4.9.18)

    $T_2(E,A) = \frac{2\pi}{A}\,\frac{\int_{\rho_-}^{\rho_+} \frac{d\rho}{\sqrt{\frac{2}{m}(E - V_A(\rho))}}}{\int_{\rho_-}^{\rho_+} \frac{d\rho}{\rho^2\sqrt{\frac{2}{m}(E - V_A(\rho))}}}$,    (4.9.19)

where $\rho_+ = \rho_+(E,A)$ and $\rho_- = \rho_-(E,A)$.
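
The integrals in Eqs. (4.9.18) and (4.9.19) are easily evaluated numerically. The Python sketch below is an illustration, not part of the original text: the function name is hypothetical, the turning points $\rho_\pm$ are supposed to be known, and the substitution $\rho = c - d\cos u$ (with c, d the half sum and half difference of $\rho_\pm$) is used to remove the inverse square root singularities at the extremes before applying the midpoint rule. As a test case, chosen only for illustration, it takes m = 1 and the isotropic oscillator potential $V(\rho) = \rho^2/2$ with E = 1, A = 0.6, for which $\rho_-^2 = 0.2$ and $\rho_+^2 = 1.8$.

```python
import math

def radial_and_angular_periods(V, m, E, A, rho_minus, rho_plus, n=2000):
    """Midpoint-rule evaluation of T1 (4.9.18) and T2 (4.9.19)."""
    c = 0.5 * (rho_plus + rho_minus)
    d = 0.5 * (rho_plus - rho_minus)
    h = math.pi / n
    i1 = 0.0  # integral of d rho / sqrt((2/m)(E - V_A))
    i2 = 0.0  # integral of d rho / (rho^2 sqrt((2/m)(E - V_A)))
    for j in range(n):
        u = (j + 0.5) * h
        rho = c - d * math.cos(u)  # rho runs from rho_minus to rho_plus
        v_eff = m * A * A / (2.0 * rho * rho) + V(rho)  # V_A of Eq. (4.9.15)
        w = d * math.sin(u) / math.sqrt(2.0 / m * (E - v_eff)) * h
        i1 += w
        i2 += w / (rho * rho)
    return 2.0 * i1, (2.0 * math.pi / A) * i1 / i2

def oscillator(rho):
    # Illustrative test potential (isotropic oscillator with unit constant).
    return 0.5 * rho * rho

T1, T2 = radial_and_angular_periods(
    oscillator, 1.0, 1.0, 0.6, math.sqrt(0.2), math.sqrt(1.8))
print(T1, T2)
```

For this test case the integrals can be computed by hand and give $T_1 = \pi$ and $T_2 = 2\pi$, which the numerical values should reproduce.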


PROOF. In the course of the proof we shall state that some functions are $C^\infty$,
leaving the proof to the reader. Let $(\dot\rho_0,\dot\theta_0,\rho_0,\theta_0) \in W$ be an initial datum
with energy E and areal velocity A/2 and consider the solution of Eq. (4.9.14),

    $t \to R(t,E,A), \qquad t \in R_+$    (4.9.20)

with initial datum

    $R(0,E,A) = \rho_-(E,A), \qquad \dot R(0,E,A) = 0$.    (4.9.21)

By the theory of one-dimensional motions, §2.7, the function R is a $C^\infty$ function
periodic in t with period

    $T_1(E,A) = 2\int_{\rho_-}^{\rho_+} \frac{d\rho}{\sqrt{\frac{2}{m}(E - V_A(\rho))}}$    (4.9.22)

where $\rho_\pm = \rho_\pm(E,A)$.
If $t_0(\rho_0,\dot\rho_0)$ is the shortest time such that

    $R(t_0,E,A) = \rho_0, \qquad \dot R(t_0,E,A) = \dot\rho_0$,    (4.9.23)

necessarily existing by our assumptions on W, it follows that

    $\rho(t) = R(t + t_0(\rho_0,\dot\rho_0),E,A)$.    (4.9.24)

To complete the analysis of the motion, it is necessary to determine $\theta(t)$. Using
Eq. (4.9.11) we find

    $\theta(t) = \theta_0 + \int_0^t \frac{A}{R(t' + t_0(\rho_0,\dot\rho_0),E,A)^2}\,dt'$    (4.9.25)

and we remark that the integrand function in Eq. (4.9.25) is a $C^\infty$ periodic
function of t with the period of Eq. (4.9.22), since such is R and also $R \ge
\rho_-(E,A) > 0$. Then by the Fourier theorem, if $T_1 = T_1(E,A)$,

    $\frac{A}{R(t,E,A)^2} = \sum_{k \in Z} \chi_k(A,E)\,e^{\frac{2\pi i}{T_1}kt}$,    (4.9.26)

where $(\chi_k)_{k \in Z}$ are the Fourier coefficients of $A/R^2$. They vanish as $k \to \infty$ faster
than any power in k. Inserting Eq. (4.9.26) into Eq. (4.9.25), it appears that

    $\theta(t) = \theta_0 + \chi_0(A,E)\,t + \sum_{k \in Z,\ k \ne 0} \chi_k(A,E)\,\frac{e^{\frac{2\pi i}{T_1}k(t+t_0)} - e^{\frac{2\pi i}{T_1}kt_0}}{\frac{2\pi i}{T_1}k}$    (4.9.27)

which we shall write as

    $\theta(t) = \theta_0 + \chi_0(A,E)\,t + S(t + t_0(\rho_0,\dot\rho_0),E,A) - S(t_0(\rho_0,\dot\rho_0),E,A)$    (4.9.28)

where

    $S(t,E,A) \stackrel{def}{=} \sum_{k \in Z,\ k \ne 0} \chi_k(A,E)\,\frac{e^{\frac{2\pi i}{T_1}kt}}{\frac{2\pi i}{T_1}k}$    (4.9.29)

is a $C^\infty$-function, periodic with period $T_1 = T_1(E,A)$.
It is then clear that the coordinates of $\boldsymbol{\rho}(t)$ have the form of Eq. (4.8.15).
For instance, if $\boldsymbol{\rho}(t) = (\rho_1(t),\rho_2(t)) \in R^2$:


    $\rho_1(t) = \rho(t)\cos\theta(t) = R(t+t_0)\cos(\theta_0 + \chi_0 t + S(t+t_0) - S(t_0))$
    $\qquad = R(t+t_0)\big(\cos(\theta_0 + \chi_0 t)\cos(S(t+t_0) - S(t_0))$    (4.9.30)
    $\qquad\quad - \sin(\theta_0 + \chi_0 t)\sin(S(t+t_0) - S(t_0))\big)$,

where the dependence on the $E, A, \rho_0, \dot\rho_0$ variables has not been explicitly written.
By Observation 4 to Definition 10, p.288, this shows the integrability of
the system and that the two periods are $T_1(E,A)$ and $T_2(E,A) = \frac{2\pi}{\chi_0(A,E)}$.

It is also easy to find explicitly the integrating transformation I: the prime
integrals are E and A, the angles $(\varphi_1,\varphi_2) \in T^2$ are, for instance, by Eqs.
(4.9.24) and (4.9.28),

    $\varphi_1(\dot\rho_0,\dot\theta_0,\rho_0,\theta_0) = \frac{2\pi}{T_1(E,A)}\,t_0(\rho_0,\dot\rho_0)$    (“average anomaly”),    (4.9.31)

    $\varphi_2(\dot\rho_0,\dot\theta_0,\rho_0,\theta_0) = \theta_0 - S(t_0(\rho_0,\dot\rho_0),E,A)$    (“average longitude”),    (4.9.32)

and the respective periods are, as already mentioned, $T_1(E,A)$ and $T_2(E,A)$
[see Eq. (4.9.28)].
Regularity and invertibility of the transformation I on suitable neighborhoods
of the trajectory starting in $(\dot\rho_0,\dot\theta_0,\rho_0,\theta_0)$ will not be explicitly checked.
It remains to check Eq. (4.9.19). Again we do not write explicitly the E
and A dependence in the functions $\rho_-(E,A)$, $\rho_+(E,A)$, $\chi_0(A,E)$, $T_1(E,A)$,
$R(t,E,A)$, $S(t,E,A)$. By the Fourier theorem,

    $\chi_0 = \frac{1}{T_1}\int_0^{T_1} \frac{A}{R(t)^2}\,dt = \frac{2}{T_1}\int_0^{T_1/2} \frac{A}{R(t)^2}\,dt$    (4.9.33)

because R(t) behaves specularly when t varies from 0 to $T_1/2$ or from $T_1/2$ to
$T_1$ (i.e., when R varies between $\rho_-$ and $\rho_+$ or between $\rho_+$ and $\rho_-$). But for
$t \in [0,T_1/2]$,

    $t = \int_{\rho_-}^{R(t)} \frac{d\rho}{\sqrt{\frac{2}{m}(E - V_A(\rho))}}$    (4.9.34)

by Eqs. (4.9.15), (4.9.20). Hence, changing variables “$t \to R$”, via Eq. (4.9.34),
it follows that

    $dt = \frac{dR}{\sqrt{\frac{2}{m}(E - V_A(R))}}$    (4.9.35)

and this implies, from Eq. (4.9.33), that

    $\chi_0 = \frac{2A}{T_1}\int_{\rho_-}^{\rho_+} \frac{dR}{R^2\sqrt{\frac{2}{m}(E - V_A(R))}}$.    (4.9.36)

□


Observation. If we regard Eq. (4.9.8) as defining a three-dimensional problem
with Lagrangian

    $L(\dot{\boldsymbol{\rho}},\boldsymbol{\rho}) = \frac{1}{2}m\dot{\boldsymbol{\rho}}^2 - V(|\boldsymbol{\rho}|)$    (4.9.37)

it follows, of course, that under the same assumptions as in Proposition 22,
the system is integrable. Now the prime integrals will be E, A and the angle
of inclination i of the orbital plane with the reference (i, j) plane. The third
angle will be the longitude in the (i, j) plane, counted from the i axis (say),
of the intersection of the orbital plane with the (i, j) plane (“nodes line”).
However, the third angle thus defined remains constant over time. This means
that the pulsations in these coordinates will be $\omega_1 = \frac{2\pi}{T_1}$, $\omega_2 = \frac{2\pi}{T_2}$, $\omega_3 = 0$.


4.9.1 Problems 


1. Let m = 1 and consider the motions associated with the Lagrangian (4.9.8) under the 
assumptions of Proposition 22. Following the idea of Problem 4, p.227, and substituting L 
for A in that problem, define 


Suppose that this relation between L, E, A can be inverted with respect to E, for E, A in
some open set V, in the form E = e(L, A) so that

    $L = X(e(L,A),A)$

with e of class $C^\infty$. Show that if E = e(L, A),

    $\frac{2\pi}{T_1(E,A)} = \frac{\partial e}{\partial L}(L,A), \qquad \frac{2\pi}{T_2(E,A)} = \frac{\partial e}{\partial A}(L,A)$.

(Hint: Note that $1 = \frac{\partial X}{\partial E}\cdot\frac{\partial e}{\partial L}$ and then use Eqs. (4.9.18) and (4.9.19) remarking that the
derivatives with respect to the integration extremes vanish as, by the definition of $\rho_-$, $\rho_+$,
the integrand vanishes at the extremes.)


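2. In the context of Problem 1 the Hamiltonian corresponding to Eq. (4.9.8) is (m = 1):

    $H(p_\rho,p_\theta,\rho,\theta) = \frac{1}{2}\Big(p_\rho^2 + \frac{p_\theta^2}{\rho^2}\Big) + V(\rho)$

and note that the function

    $\hat S(L,A,\rho,\theta) = A\theta + \int_{\rho_-(e(L,A),A)}^{\rho} \sqrt{2(e(L,A) - V_A(\rho'))}\,d\rho'$

solves the Hamilton-Jacobi equation

    $H\Big(\frac{\partial \hat S}{\partial\rho},\frac{\partial \hat S}{\partial\theta},\rho,\theta\Big) = e(L,A)$.

From this fact, infer that $\hat S$ generates a change of coordinates (completely canonical)

    $(p_\rho,p_\theta,\rho,\theta) \leftrightarrow (L,A,\ell,g)$

where $\ell$, g are angular variables defined in Eqs. (4.9.31) and (4.9.32) in terms of the data:

    $\ell = \frac{2\pi}{T_1(e(L,A),A)}\,t_0, \qquad g = \theta - S(t_0,e(L,A),A)$,

and the Hamiltonian in the new variables is simply H = e(L, A).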


3. In the context of Problem 2, define

    $S(E,A,\rho,\theta) = A\theta + \int_{\rho_-(E,A)}^{\rho} \sqrt{2(E - V_A(\rho'))}\,d\rho'$.

Check that this is a two-parameter local solution to the Hamilton-Jacobi equation

    $H\Big(\frac{\partial S}{\partial\rho},\frac{\partial S}{\partial\theta},\rho,\theta\Big) = E$

(the parameters being E and A) and S generates a completely canonical transformation
$(p_\rho,p_\theta,\rho,\theta) \leftrightarrow (E,A,\tau,\alpha)$ in which $\alpha$ is a constant angle and $\tau$ varies linearly over time.
Show that these new coordinates cannot be extended to a well-defined system of coordinates
in the vicinity of a full trajectory of the motion if this trajectory corresponds to a quasi-periodic
motion with two periods having irrational ratio. Note that this is not the case for
the other coordinate transformation of Problem 2.


4. In the context of Problem 3, the change of coordinates introduced there can be extended
to a well-defined system of coordinates in the vicinity of a full trajectory for which
$\rho_+(E,A) = +\infty$ (i.e., in the vicinity of an unbounded trajectory) if $\limsup_{\rho \to +\infty} V_A(\rho) < E$.
In this case, the pair of variables $(H,\tau)$ are called “energy-time” coordinates. Why?


5. Solve Problems 1 and 2 for arbitrary m > 0.


Leva dunque, lettore, a l’alte ruote
Meco la vista, dritto a quella parte
Dove l’un moto e l’altro si percuote;
E lì comincia a vagheggiar ne l’arte
Di quel maestro che dentro a sé l’ama
Tanto che mai da lei l’occhio non parte,
Vedi come da indi si dirama
L’oblico cerchio che i pianeti porta
Per sodisfare al mondo che li chiama.⁸


The main result of §4.9, expressed by Proposition 22, is that the motion
of a point mass in a central force field under some hypotheses on the initial
data is a quasi-periodic motion with two periods given by Eqs. (4.9.18) and
(4.9.19) depending upon the energy E and the areal velocity A/2.

By contemplating Eqs. (4.9.18) and (4.9.19), it is easy to convince oneself
that in general $T_1(E,A)$ and $T_2(E,A)$ are “independent”. Hence, unless

    $\frac{T_1(E,A)}{T_2(E,A)} = \text{rational number}$,    (4.10.1)

which is “exceptional” when E and A vary, the motion is actually quasi-periodic
and not periodic.

Note, however, that the set of the space points where Eq. (4.10.1) holds
will generally be dense in the region W where the motion is integrable. As an
exercise, the reader may show the truth of this statement near a point of W
where the E and A values are such that the Jacobian determinant of the map
$(E,A) \to (T_1(E,A),T_2(E,A))$ does not vanish.

However, there are two exceptional and marvelous cases.

The first, already implicitly studied in §4.1, is the harmonic oscillator
bound to O by a force with potential


    $V(\rho) = \frac{k}{2}\rho^2$    (4.10.2)

leading to

    $2T_1 = T_2 = 2\pi\sqrt{\frac{m}{k}} \equiv T$    (4.10.3)


8 In basic English:
Look up now, reader, to the high wheels
together with me, straight there
where several motions hit each other.
And there begin to wonder about the art
Of that master who inside himself moves them with his love
so much that he never drops his eyes away.
Look up how the oblique circle bearing the planets develops there
to satisfy the world that calls them.
(Dante, Paradiso, Canto X)


and the orbits are ellipses centered in O. Equation (4.10.3) could be proved
by computing the integrals of Eqs. (4.9.18) and (4.9.19) (which is a long but
straightforward calculation). However, the reader should try to find a simple
argument leading to Eq. (4.10.3) without any explicit calculations beyond the
ones already done in §4.9.

The other case corresponds to

    $V(\rho) = -\frac{mg}{\rho}$.    (4.10.4)

This is the case of the so-called “Newtonian two-body problem” or “Kepler’s
problem”. If E < 0, the motion is periodic and $T_1 = T_2$, although $T_1$ and $T_2$
now actually depend on A and E, and the orbits are ellipses with focus in O.


We treat this problem in some detail by proving the following proposition. 


23 Proposition. The motions with energy E < 0 and areal velocity $\frac{1}{2}|A| \ne 0$
are periodic and the integrals of Eqs. (4.9.18) and (4.9.19) coincide, $\forall E < 0$,
$\forall A \ne 0$. Furthermore:
(i) the trajectories $t \to \boldsymbol{\rho}(t)$, $t \in R_+$, are ellipses with focus in O;
(ii) such ellipses are run with constant areal velocity A/2;
(iii) the ratio between the square of the revolution period T and the cube of
the length of the ellipse major axis is a constant solely depending on g.
Finally, if $\rho_+$ and $\rho_-$ are the focal distances of the ellipse on which a given
motion takes place:

    $\rho_+ + \rho_- = -\frac{mg}{E}$,    (4.10.5)

    $\rho_+\rho_- = -\frac{mA^2}{2E}$,    (4.10.6)

    $T = 2\pi\Big(\frac{\rho_+ + \rho_-}{2}\Big)^{3/2} g^{-1/2}$.    (4.10.7)
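
Statement (iii) can be checked with a few lines of Python; the sketch is an illustration added here and its function names are hypothetical. It assumes Eqs. (4.10.5) and (4.10.6) for the focal distances and computes the period as the area of the ellipse, $\pi a b$ with semi-axes $a = (\rho_+ + \rho_-)/2$ and $b = \sqrt{\rho_+\rho_-}$, divided by the areal velocity $|A|/2$ of statement (ii).

```python
import math

def focal_distances(E, A, m, g):
    """rho_-, rho_+ from their sum (4.10.5) and their product (4.10.6)."""
    if E >= 0:
        raise ValueError("bounded motions require E < 0")
    s = -m * g / E                # rho_+ + rho_-
    p = -m * A * A / (2.0 * E)    # rho_+ * rho_-
    disc = s * s - 4.0 * p
    if disc < 0:
        raise ValueError("E is below the minimum of the effective potential")
    root = math.sqrt(disc)
    return 0.5 * (s - root), 0.5 * (s + root)

def kepler_period(E, A, m, g):
    """Area of the ellipse divided by the areal velocity |A|/2."""
    rho_minus, rho_plus = focal_distances(E, A, m, g)
    a = 0.5 * (rho_plus + rho_minus)    # semi-major axis
    b = math.sqrt(rho_plus * rho_minus)  # semi-minor axis
    return 2.0 * math.pi * a * b / abs(A)

def third_law_ratio(E, A, m, g):
    """Square of the period over the cube of the major axis."""
    rho_minus, rho_plus = focal_distances(E, A, m, g)
    return kepler_period(E, A, m, g) ** 2 / (rho_plus + rho_minus) ** 3

print(third_law_ratio(-0.5, 0.8, 1.0, 1.0), third_law_ratio(-0.25, 1.0, 2.0, 1.0))
```

The two printed ratios correspond to different values of E, A and m but to the same g, and they coincide (their common value being $\pi^2/(2g)$).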


Observations.

(1) (i), (ii), and (iii) are Kepler’s laws. Starting from them, Newton realized
that if one wanted to describe the motion of a planet by a second-order differential
equation $m\ddot{\boldsymbol{\rho}} = \mathbf{F}(\boldsymbol{\rho})$, the only possibility was that $\mathbf{F}(\boldsymbol{\rho}) = -mg\frac{\boldsymbol{\rho}}{|\boldsymbol{\rho}|^3}$.
This led him to assume, by symmetry, g = kM, M = mass of the Sun, i.e.,
$V(\rho) = -k\frac{mM}{\rho}$ which is the universal law of gravitation.

Of course, he also assumed that (i), (ii), and (iii) would describe the motion
laws of an arbitrary body revolving around the Sun, whatever its initial position
and speed.

Newton’s argument is interesting and different in spirit from the one based on
the analytic theory of differential equations. It is based on some beautiful geometric
considerations relying on the theory of conic sections: it can be found
in the first book of the Principia, [37].

(2) One could also easily study the E > 0 or E = 0 motions: they are not 
periodic motions and the trajectories become a hyperbola wing or a parabola, 
respectively. This is a simple exercise along the lines of the upcoming proof 
and it will be left to the reader. 

(3) The heavenly bodies have finite extension. Hence, if a satellite revolves
circularly around a primary body (planet or Sun), always turning the same
face to it, a situation in apparent contradiction to Kepler’s laws is produced.
In fact, if the satellite is thought of as decomposed into small point masses, the
points on one face rotate on an orbit with radius smaller than the orbit of the
points of the opposite face. Hence, if one could neglect the mutual interactions
between the points of the body, they would have to have a different rotation
period around the main body (by Kepler’s third law) and the satellite
would disintegrate over time. This means that if the above catastrophic event
does not occur, the body must be subject to some internal stresses (“tidal
stresses”) which cannot be stronger “than the body’s material resistance”
(otherwise, the satellite could not exist). So Kepler’s laws and the gravitation
law provide a mechanism for explaining Saturn’s rings and why, in general,
satellites stay quite far away from a planet (see problems at the end of this
Section).


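PROOF. With the notation of §4.9, let $V_A(\rho) = m\Big(\frac{A^2}{2\rho^2} - \frac{g}{\rho}\Big)$; the angular
momentum and energy conservation laws lead to

    $\frac{1}{2}\dot\rho^2 + \frac{1}{2}\frac{A^2}{\rho^2} - \frac{g}{\rho} = \frac{E}{m}$    (4.10.8)

or

    $\dot\rho^2 = \frac{2}{m}(E - V_A(\rho))$.    (4.10.9)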

The graph of $V_A$ is illustrated in Fig. 4.8.


Figure 4.8: Gravitational potential in presence of the “centrifugal barrier” $\frac{mA^2}{2\rho^2}$ (the graph marks the minimum value $-\frac{mg^2}{2A^2}$).

Hence, if E < 0, $E > -\frac{mg^2}{2A^2}$, the roots of the equation $V_A(\rho) = E$ are
$\rho_-(E,A)$, $\rho_+(E,A)$ and they can be explicitly found by solving a second-degree
equation in the unknown $\frac{1}{\rho}$. Factorizing the polynomial in $\frac{1}{\rho}$ given by
$\frac{E}{m} - \frac{V_A(\rho)}{m}$ in terms of its roots $\frac{1}{\rho_\pm}$:

    $\frac{E}{m} - \frac{V_A(\rho)}{m} = \frac{A^2}{2}\Big(\frac{1}{\rho} - \frac{1}{\rho_+}\Big)\Big(\frac{1}{\rho_-} - \frac{1}{\rho}\Big)$.    (4.10.10)

The radii $\rho_- < \rho_+$ are, as we shall shortly see, the focal distances of the
ellipse on which the motion develops. They obviously verify Eqs. (4.10.5) and
(4.10.6) because $\rho_+^{-1} + \rho_-^{-1} = \frac{2g}{A^2}$, $\rho_+^{-1}\rho_-^{-1} = -\frac{2E}{mA^2}$.

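Use Eq. (4.10.10) to rewrite Eqs. (4.10.8) and (4.10.9) as

$$\dot\rho = \pm A\sqrt{\Big(\frac1\rho - \frac1{\rho_+}\Big)\Big(\frac1{\rho_-} - \frac1\rho\Big)}, \tag{4.10.11}$$

$$\dot\theta = \frac{A}{\rho^2}, \tag{4.10.12}$$

and suppose that for $t = 0$, it is $\rho(0) = \rho_-$, $\theta(0) = \pi$. Since the motion of $\rho$ is
periodic, being a solution to Eq. (4.9.14), and oscillates between $\rho_-$ and $\rho_+$,
this hypothesis does not affect the generality.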

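Then in Eq. (4.10.11) the + sign holds for $t\in[0,\frac T2]$ if $T$ is the period of
the $\rho$-motion [Eq. (4.9.18)]:

$$T(E,A) = 2\int_{\rho_-(E,A)}^{\rho_+(E,A)} \frac{d\rho}{\sqrt{\frac{2}{m}\big(E - V_A(\rho)\big)}}. \tag{4.10.13}$$

Hence, for $t\in[0,\frac T2]$, Eq. (4.10.12) implies that $\theta$ is a strictly increasing
function (as $A > 0$ by assumption) of $t$: thus $\rho$ can be regarded as a function
of $\theta$ instead of $t$ so that Eq. (4.10.11) divided by Eq. (4.10.12) yields

$$\frac{d\rho}{d\theta} = \rho^2\sqrt{\Big(\frac1\rho - \frac1{\rho_+}\Big)\Big(\frac1{\rho_-} - \frac1\rho\Big)}. \tag{4.10.14}$$

For $\rho_- < \rho < \rho_+$ this implies

$$\theta - \pi = \int_{\rho_-}^{\rho} \frac{d\rho'}{\rho'^2\sqrt{\big(\frac1{\rho'} - \frac1{\rho_+}\big)\big(\frac1{\rho_-} - \frac1{\rho'}\big)}}, \tag{4.10.15}$$

which is an elementary integral. Changing the variable as $y = \rho^{-1}$, after some
algebra, one finds

$$\frac1\rho = \frac12\Big(\frac1{\rho_+} + \frac1{\rho_-}\Big) + \frac12\Big(\frac1{\rho_-} - \frac1{\rho_+}\Big)\cos(\theta - \pi), \tag{4.10.16}$$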


showing that when $\theta$ reaches $2\pi$, $\rho$ reaches $\rho_+$.

The study of the trajectory for $t\in[T/2, T]$ proceeds likewise, changing the
choice of sign in Eq. (4.10.11), and one finds that the trajectory still verifies
Eq. (4.10.16), and at time $T$, when $\rho$ takes the value $\rho_-$, the angle $\theta$ takes
the value $3\pi$. This means that after time $T$ has elapsed, not only $\rho$ but also $\theta$
takes on its initial value (of course $\theta$ has to be measured mod $2\pi$). Hence, the
trajectory is closed because $\dot\theta$ and $\dot\rho$ also take on again the initial values, by
Eqs. (4.10.11) and (4.10.12) (i.e., $\dot\rho = 0$, $\dot\theta = A/\rho_-^2$) and because of the autonomy
of the equations of motion.

Equation (4.10.16), well known from elementary geometry, is the polar
coordinates equation of an ellipse with focus at the origin, focal distances $\rho_-$
and $\rho_+$, and major axis along the x-axis (and “perihelion” on the negative
x-axis).

To compute the period of the motion, it suffices to calculate the integral
of Eq. (4.10.13), elementary after the substitution $y = \rho^{-1}$. However, this
calculation can be avoided by recalling that the ellipse is run with constant
areal velocity $\frac A2$ and, hence, $T$ can be obtained by dividing the area of the
ellipse of Eq. (4.10.16) by $\frac A2$. This area is

$$\pi\,\frac{\rho_+ + \rho_-}{2}\,\sqrt{\rho_+\rho_-} \tag{4.10.17}$$

because the semi-axes of an ellipse with focal distances $\rho_+$ and $\rho_-$ are $\frac{\rho_+ + \rho_-}{2}$
and $\sqrt{\rho_+\rho_-}$. Hence,

$$T = \frac{2}{A}\,\pi\,\frac{\rho_+ + \rho_-}{2}\,\sqrt{\rho_+\rho_-} = \frac{2\pi}{\sqrt g}\Big(\frac{\rho_+ + \rho_-}{2}\Big)^{3/2} \tag{4.10.18}$$

by Eqs. (4.10.5) and (4.10.6).
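
As a numerical sanity check of the period computation above, the following sketch evaluates the quadrature of Eq. (4.10.13) directly and compares it with the closed form $2\pi g^{-1/2}\big(\frac{\rho_++\rho_-}{2}\big)^{3/2}$. It assumes the potential $V_A(\rho) = m\big(\frac{A^2}{2\rho^2} - \frac g\rho\big)$; the function names, the substitution $\rho = a - c\cos u$ (used only to remove the integrable endpoint singularities) and the sample values of E, A, g are illustrative choices, not part of the original text.

```python
import math

def kepler_radii(E, A, g, m=1.0):
    """Return (rho_minus, rho_plus), the two roots of V_A(rho) = E.

    In the unknown y = 1/rho the equation is (A^2/2) y^2 - g y - E/m = 0.
    """
    disc = g * g + 2.0 * A * A * E / m
    if E >= 0 or disc <= 0:
        raise ValueError("need -m g^2 / (2 A^2) < E < 0")
    root = math.sqrt(disc)
    return A * A / (g + root), A * A / (g - root)

def period_by_quadrature(E, A, g, m=1.0, n=2000):
    """Midpoint-rule evaluation of T = 2 * integral of d rho / |rho_dot|."""
    rho_minus, rho_plus = kepler_radii(E, A, g, m)
    a = 0.5 * (rho_plus + rho_minus)   # major semi-axis
    c = 0.5 * (rho_plus - rho_minus)   # half the distance between extremes
    h = math.pi / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        rho = a - c * math.cos(u)      # sweeps rho_minus -> rho_plus
        V = m * (A * A / (2.0 * rho * rho) - g / rho)
        radial_speed = math.sqrt(2.0 / m * (E - V))
        total += c * math.sin(u) / radial_speed   # d rho = c sin(u) du
    return 2.0 * total * h

def period_third_law(E, A, g, m=1.0):
    """Closed form T = 2 pi a^(3/2) / sqrt(g)."""
    rho_minus, rho_plus = kepler_radii(E, A, g, m)
    a = 0.5 * (rho_plus + rho_minus)
    return 2.0 * math.pi * a ** 1.5 / math.sqrt(g)
```

For instance, with E = -0.5, A = 0.8, g = 1 and m = 1 the radii are 0.4 and 1.6, the major semi-axis is 1, and both functions should return values very close to $2\pi$.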


4.10.1 Exercises and Problems 


Use the tables in Appendix P for the numerical values, when necessary. Problems
1 through 9 are inspired by [6].


1. Let T’ be a heavenly body identical to the Earth. Could a satellite T” identical to the
Earth (i.e., a twin) be eternally eclipsed by T’ while they revolve around the Sun S on a
circular orbit in a one-year period? Compute the T’T” distance as well as the ST’ distance,
comparing the percentage difference between ST’ and the actual average distance between
the Sun and the Earth.


2. Could a point mass M have two homogeneous rigid gravitational satellites with radius
$\delta$ and mass $\mu$ whose surfaces touch at a point at distance $\rho$ from M? Find the necessary
relations among $\rho,\delta,\mu,M$ assuming $\delta \ll \rho$ and to first order in $\frac\delta\rho$. Compute the force
$\tau$ (“disruptive force”) due to the spheres contact. (Answer: $\delta < \rho\,\big(\frac{\mu}{12M}\big)^{1/3}$,
$\tau = \frac{k\mu^2}{4\delta^2}\big(1 - 12\,\frac{M}{\mu}\,\frac{\delta^3}{\rho^3}\big)$, k being the gravitational constant. Suppose that the force $\tau$ cannot
be negative, i.e., that the two bodies can only “push” each other.)

3. Same as Problem 2, but assuming that the body with mass M is a homogeneous sphere
with radius R and that both the planet and the satellites have the same density $\sigma$:
$M = \frac{4\pi}{3}\sigma R^3$, $\mu = \frac{4\pi}{3}\sigma\delta^3$. Show that to first order in $\frac\delta\rho$ there is no condition on $\delta$ but only a
condition on the ratio between R and $\rho$. (Answer: $\tau \ge 0 \leftrightarrow 1 - 12\big(\frac R\rho\big)^3 \ge 0 \leftrightarrow \rho \ge 2.29R$.)


4. Use Problem 3 to show that a heuristic estimate for the minimum distance of a planet
to the Sun center is ~ 2.29R if R is the Sun radius. Compute $\rho$ in km and compare it to
the orbital radius of Mercury, schematizing the Sun as a sphere with radius equal to its
optically apparent radius.


5. Same as Problem 4 to estimate at what distance from the Earth one can find the closest
satellite with the same density (~ 2.29 × 6.3 × 10³ km). Why can the artificial satellites
gravitate much closer? (See Exercise 6.)


6. Assume that a satellite to a planet is made of rock with density $\sigma$, cohesion force per
unit surface $\gamma$, and with diameter $\delta$. Using Problem 3, find a heuristic estimate of how
large $\delta$ must be in order that the satellite cannot gravitate at distance $\rho$ from the planet
(supposed to have the same density) if $\rho$ is in the forbidden band ($\rho < 2.29R$). (Hint:
Compute the tidal force $\tau$ of Exercise 3 and compare it with the cohesion force $\pi\big(\frac\delta2\big)^2\gamma$: if
$\frac{k\mu^2}{4\delta^2}\big(12\big(\frac R\rho\big)^3 - 1\big) > \pi\big(\frac\delta2\big)^2\gamma$ the tidal force prevails over the cohesion force and the
body breaks up.)


7. Let $\sigma$ = 5.5 g/cm³, $\gamma$ = 100 kgw/cm², k = 6.67 × 10⁻⁸ cm³/(g · sec²), $\rho$ = 7.0 × 10³ km,
R = 6.33 × 10³ km. How big should a rocky satellite be in order to apply the instability
argument of Problem 4? Same for $\rho = 2R$. (1 kgw = weight of a mass of 1 kg at the Earth
surface.)


8. At what distance from Saturn can one find its closest satellite? Compare it with the 
distance of Mimas. 


9. Assuming that Saturn’s rings consist of rocky satellites with a cohesion modulus $\gamma$ like
that of Exercise 7 and a density equal to that of Saturn (3.g/cm³), heuristically estimate
how big the ring stones can be as a function of the radius r of the ring. Compare their
maximum diameter with the observed width of the rings (~ 20 km).


10. Solve explicitly Problems 1, 2, and 4 in §4.9 in the case of Kepler’s problem, explicitly
computing L and $\varepsilon(L, A)$. (Answer: $\varepsilon(L, A) = -\frac{m^3k^2}{2L^2}$ if $V(\rho) = -\frac{km}{\rho}$.)
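11. Given a Kepler motion in $R^2$ with energy E, set $a = \frac{\rho_+ + \rho_-}{2}$, $e = \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-}$ = (eccentricity
of the ellipse with focal distances $\rho_+$ and $\rho_-$), and set

$$L = \frac{m^{3/2}k}{\sqrt{-2E}},\qquad G = mA.$$

Applying Problems 1, 2, and 5 of §4.9, consider the canonical transformation
$(p_\rho, p_\theta, \rho, \theta) \leftrightarrow (L, G, \ell, g)$ associated with the generating function

$$S(L,G,\rho,\theta) = G\theta + \int_{\rho_-}^{\rho}\sqrt{2m\big(\varepsilon(L) - V_{G/m}(\rho')\big)}\;d\rho',$$

where $\varepsilon(L) = -\frac{m^3k^2}{2L^2} = E$ and $\rho_+$ and $\rho_-$ depend on L, G (being equal to $\rho_+(E, A)$ and
$\rho_-(E, A)$), i.e., consider the map I generated by

$$p_\rho = \frac{\partial S}{\partial\rho},\qquad p_\theta = \frac{\partial S}{\partial\theta},\qquad \ell = \frac{\partial S}{\partial L},\qquad g = \frac{\partial S}{\partial G}.$$

Applying Problems 1, 2, and 5 of §4.9 and Problem 10 above, show that I can be extended
to the entire set of initial data such that $G > 0$, $E_G \equiv -\frac{m^3k^2}{2G^2} < E < 0$ and that the image
of this set of data via I has the form $V \times T^2$, where $V = \{(L, G)\,|\,G > 0,\ E_G < -\frac{m^3k^2}{2L^2} < 0\}
= \{(L, G)\,|\,G > 0,\ L > G\}$, and check that $\ell, g$ are “angles”, $(\ell, g) \in T^2$.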


12. Show that the physical interpretation of the angle g canonically conjugated to G in
Problem 11 is that of the longitude of the major semiaxis of the ellipse, while the angle
$\ell$ conjugated to L, “average anomaly”, is $\ell = \frac{2\pi}{T}t$, where t is the time necessary to reach
the initial point of the orbit starting, say, at time zero from the “perihelion”, i.e. from the
extreme point on the major axis closest to the center of force.


13. Consider a point attracted to the origin by a gravitational force. Suppose that its energy
is negative so that it moves on an ellipse and let $(L, G, \ell, g)$ be its Keplerian coordinates (see
Problems 11 and 12). Let $(p_\rho, p_\theta, \rho, \theta)$ be the corresponding “natural canonical coordinates”
(see Problem 2, §4.9) and let $\beta$ be the polar angle formed by the position vector with
the major semiaxis of the ellipse on which the motion develops following the initial data
$(p_\rho, p_\theta, \rho, \theta)$. Call a, b and e the major semiaxis, the minor semiaxis, and the eccentricity of
the ellipse, respectively, and write its equation as

$$\rho = \frac{p}{1 - e\cos\beta}$$

[see Eq. (4.10.16)] and define $\xi$, the “eccentric anomaly”, as

$$\rho = a\,(1 + e\cos\xi).$$

Find relations expressing $\rho, \theta, \beta$ in terms of the Keplerian variables $(L, G, \ell, g)$. Show that

$$\ell = (1-e^2)^{3/2}\int_0^\beta \frac{d\beta'}{(1 - e\cos\beta')^2} = \beta + 2e\sin\beta + \frac34 e^2\sin2\beta + \dots,$$

$$\beta = \ell - 2e\sin\ell + \frac54 e^2\sin2\ell + \dots,$$

$$\ell = \xi + e\sin\xi,$$

$$\xi = \ell - e\sin\ell + \frac{e^2}{2}\sin2\ell + \dots,$$

$$\theta = g + \beta,$$

$$\rho = p\,(1 - e\cos\beta)^{-1} = a\,(1 + e\cos\xi).$$

For a more detailed theory of the equation $\ell = \xi + e\sin\xi$ see problem 6, p.486 where the
radius of convergence of the inverse function, the “Laplace limit”, is discussed. (Hint: Use
Eq. (4.10.12) and $\ell = \frac{2\pi}{T}t$ to find that $\frac{d\ell}{d\beta} = \frac{(1-e^2)^{3/2}}{(1 - e\cos\beta)^2}$ (noting that $\beta$ is the analogue
of the angle $\theta$ of §4.10) and having used all the relations between $\rho_+, \rho_-, A, T$ in Eqs.
(4.10.5)-(4.10.7). Use Eq. (4.10.11) to see analogously that

$$\frac{d\rho}{d\ell} = \frac{a}{\rho}\sqrt{a^2e^2 - (\rho - a)^2}.$$

Then integrate the first equation to express $\ell$ in terms of $\beta$ and the second equation to
express $\ell$ in terms of $\xi$ after changing variables as $\rho = a\,(1 + e\cos\xi)$.
To prove the expansion for $\beta$ as an eccentricity series, consider the integral expression of
$\ell$ in terms of $\beta$ found above and expand the function $\frac{(1-e^2)^{3/2}}{(1 - e\cos\beta)^2}$ in powers of e before
integrating and, then, integrate term by term.)

14. Using Problem 13, express the Cartesian coordinates of the position in terms of
$(L, G, \ell, g)$, proving that

$$x = a\,(1 + e\cos\xi)\cos(g + \beta),\qquad y = a\,(1 + e\cos\xi)\sin(g + \beta),$$

or

$$x = \frac{p}{1 - e\cos\beta}\cos(g + \beta),\qquad y = \frac{p}{1 - e\cos\beta}\sin(g + \beta),$$

where $\xi, \beta$ have to be expressed in terms of $(L, G, \ell, g)$ via the formulae of Problem 13.


15. Using Problems 13 and 14, show that the Cartesian coordinates can be expressed
correctly in terms of $(L, G, \ell, g)$ up to second order in the eccentricity e as

$$x = a\,[\cos(g + \ell) + e\,A_x(g, \ell) + e^2 B_x(g, \ell)] + O(e^3),$$
$$y = a\,[\sin(g + \ell) + e\,A_y(g, \ell) + e^2 B_y(g, \ell)] + O(e^3),$$

where

$$A_x = \cos g + \sin\ell\,\sin(g + \ell),\qquad A_y = \sin g - \sin\ell\,\cos(g + \ell),$$

$$B_x = \sin g\,\sin\ell - \frac34\sin(g + \ell)\sin2\ell,\qquad B_y = -\cos g\,\sin\ell + \frac34\cos(g + \ell)\sin2\ell,$$

which, calling $\varepsilon$ the eccentricity to avoid confusion with e = 2.71 ..., can also be written in
complex form

$$x + iy = a\,e^{i(g+\ell)}\Big[1 + \varepsilon\,\big(e^{-i\ell} - i\sin\ell\big) + \varepsilon^2\Big(-i\,(\sin\ell)\,e^{-i\ell} + \frac34\,i\sin2\ell\Big)\Big] + O(\varepsilon^3).$$


16. Consider the problem analogous to Problems 10 and 11 in the case of the Kepler
motion in $R^3$ and look for a completely canonical transformation between the natural
polar coordinates $(p_\rho, p_\varphi, p_\theta, \rho, \varphi, \theta)$ in terms of which the Hamiltonian is

$$\frac{1}{2m}\Big(p_\rho^2 + \frac{p_\varphi^2}{(\rho\sin\theta)^2} + \frac{p_\theta^2}{\rho^2}\Big) - \frac{km}{\rho}$$

(here $\rho$ = radial distance, $\varphi$ = longitude and $\theta$ = latitude) corresponding to the Lagrangian

$$\frac m2\big(\dot\rho^2 + \rho^2(\sin\theta)^2\dot\varphi^2 + \rho^2\dot\theta^2\big) + \frac{km}{\rho}$$

and the coordinates $(L, G, \Theta, \ell, g, \gamma)$, where $L, G, \Theta$ are defined in terms of the energy E,
the areal velocity A and of the orbit inclination i with respect to the z-axis by

$$L = \frac{m^{3/2}k}{\sqrt{-2E}},\qquad G = mA,\qquad \Theta = G\cos i$$

and $\ell, g, \gamma$ are their canonically conjugated variables which will turn out to be $\ell$ = (average
anomaly in the ellipse plane), g = (longitude of the major semiaxis of the ellipse in its plane
measured, say, from the nodal line, i.e., from the line of intersection of the ellipse plane and
the (i, j)-plane of the inertial reference system (O; i, j, k) to which the motion is referred),
$\gamma$ = (longitude of the nodal line of the ellipse plane in the plane (i, j) measured, say, from
i, “angle of ascension”).
Show that, if e(L) = ats the above transformation is completely canonical and is gener- 
ated by the solution of the Hamilton-Jacobi equation 


1 (88,5 1 (08.5. 1,88.,\ km 
EN fy gtd ey Ak eee 2), 2 eG 
(=) + Fant Bp. + (3G o ett) 


parameterized by L,G, © and having the form (solution with “separation of variables”) 
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T aa aac T 
S(o,0, 9; L,G, 0) = =G —~)90+¢6¢,0(8) + Fa,1(Q) — rad 


with 
= i a. o? 
dð sin? 0” 

dog.e\2 m3 k2 G? m 
—) =2 L) - =2 z> tk—). 

( ) =2m(e(L) — Ve/m(@)) = 2m( - 2 ma =) 

Hint: The latitude @ varies between 0- = 5 —i and 64 = 4 +i, assuming to have chosen 
2 2 


5: The @ variable varies between 


o— and g_. Then 0_, 6+, 0—, 0+ can be computed from L,G, ©. Write 


rs 0 Q2 ad o 
a.o (ê) = J dera T u Farle) = / 2m(e(L) — Ve/m(o) de! 
o sin“ 0 oe 


and note that the variables 7, g, @ are defined by 


the axis normal to the ellipse plane oriented so that i < 


T ača, o 
2 3O 


OGG,L T _ TG, 


3G (o) £ 


_ 96,0 
g 2’ aL 


OG 


(9), 


(9) 4 (o). 

In the new variables, the Hamiltonian becomes e(L) so that the Keplerian evolution is, 
in these variables, 7 = constant, g = constant. So we can compute T and g by choosing 
special phase-space points on the orbit. To find the meaning of 7, consider the time when 
the point occupies the “highest position”, 0 = 0—. Show that 22 (6_) = 0, noting that the 
argument of the integral for @g,6 vanishes for 0 = 64. Hence, T = =F + when 0 = 6_. 
Geometrically, this means that 7 is the angle formed with the x-axis by the line in the ry 
plane orthogonal to the projection on the xy plane of the normal to the orbital plane, i.e., 
the nodal line. 

Similarly, to find the meaning of g, consider the time when @ = ọ— (i.e., the point is at the 


é ; 0G 
perihelion). Now, so (o—) = 0 and 


= ©? 1 T cos? i 
~ G2 sin?6 sin? 6 
where ĝo is the polar angle corresponding to the perihelion position. This relation can be 


interpreted as saying that g — E is the time necessary for a point moving according to the 
equation 


g 


o? ı 

G2 sin? 0 
to go from 0— to 09. On the other hand, it is easy to see that the above equation also 
describes the 0 variation over time in a circular uniform motion on the unit circle in the 
plane of the ellipse (inclined by 7) with unit speed. So g — 5 is the angle between the 
major semiaxis of the ellipse and the intersection between the ellipse plane and the plane 
containing the normal to the ellipse and the z-axis (“azimuthal plane” of the normal), since 
the angle between the latter line and the nodal line is at it follows that g has the desired 
interpretation. The angle £ has the same expression found for the planar case (see problem 
11). Hence it has the same interpretation of average anomaly). 


i = 


17. Express the Cartesian coordinates of the position corresponding to $(L, G, \Theta, \ell, g, \gamma)$ of
Problem 16. (Hint: Use the results of Problem 15 directly.)
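
The second-order expansion of Problem 15 can be checked numerically. The sketch below assumes the conventions of Problem 13, namely $\ell = \xi + e\sin\xi$, $\rho = a(1 + e\cos\xi) = p\,(1 - e\cos\beta)^{-1}$ with $p = a(1 - e^2)$, which give $\tan\beta = \sqrt{1-e^2}\,\sin\xi/(\cos\xi + e)$. It solves the equation for $\xi$ by Newton's method and compares the exact position with the truncated series in complex form. The function names and the choice of Newton's method are illustrative, not taken from the text.

```python
import cmath
import math

def eccentric_anomaly(ell, e, tol=1e-14, max_iter=50):
    """Solve ell = xi + e sin(xi) for xi by Newton's method (needs 0 <= e < 1)."""
    xi = ell
    for _ in range(max_iter):
        step = (xi + e * math.sin(xi) - ell) / (1.0 + e * math.cos(xi))
        xi -= step
        if abs(step) < tol:
            break
    return xi

def exact_position(a, e, ell, g):
    """x + i y computed from rho = a (1 + e cos xi) and theta = g + beta."""
    xi = eccentric_anomaly(ell, e)
    rho = a * (1.0 + e * math.cos(xi))
    # Numerator and denominator are sin(beta), cos(beta) times (1 + e cos xi) > 0.
    beta = math.atan2(math.sqrt(1.0 - e * e) * math.sin(xi), math.cos(xi) + e)
    return rho * cmath.exp(1j * (g + beta))

def second_order_position(a, e, ell, g):
    """x + i y from the expansion of Problem 15, truncated after e^2."""
    first = cmath.exp(-1j * ell) - 1j * math.sin(ell)
    second = -1j * math.sin(ell) * cmath.exp(-1j * ell) + 0.75j * math.sin(2.0 * ell)
    return a * cmath.exp(1j * (g + ell)) * (1.0 + e * first + e * e * second)

def max_truncation_error(e, a=1.0, g=0.3, samples=200):
    """Largest |exact - truncated| over a grid of average anomalies."""
    return max(
        abs(exact_position(a, e, 2.0 * math.pi * k / samples, g)
            - second_order_position(a, e, 2.0 * math.pi * k / samples, g))
        for k in range(samples)
    )
```

If the coefficients are right, `max_truncation_error(e)` should shrink roughly like $e^3$ as e decreases, which is a quick way to catch a wrong sign or a wrong factor in $B_x$, $B_y$.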

Consider N point masses, with masses $m_1,\dots,m_N > 0$, subject to an ideal
constraint imposing that the system be rigid and have a fixed point O. Suppose
$N \ge 3$ and that the points are not aligned.

We shall describe the motions in a reference frame $(O;\bar{\mathbf i},\bar{\mathbf j},\bar{\mathbf k})$, conventionally
called “fixed”, and we shall fix a “comoving” frame $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ with axes
suitably chosen.

To determine the position of the body, it will suffice to give the position
of the reference frame $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ since, in this system of coordinates, the
i-th point has constant coordinates by the rigidity constraint. We shall use
the Euler angles $(\bar\theta,\bar\varphi,\bar\psi)$ to define $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$; they are defined in §3.9, Fig.
3.3 (see Fig. 4.9):


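Figure 4.9. Euler angles of the “comoving” frame $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ in the fixed frame $(O;\bar{\mathbf i},\bar{\mathbf j},\bar{\mathbf k})$.

The kinetic energy can be expressed in terms of the angular velocity $\omega$ of
$(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ with respect to $(O;\bar{\mathbf i},\bar{\mathbf j},\bar{\mathbf k})$, [see Eqs. (3.9.11) and (3.9.12)]:

$$\omega = \dot{\bar\theta}\,\mathbf n + \dot{\bar\varphi}\,\bar{\mathbf k} + \dot{\bar\psi}\,\mathbf i_3, \tag{4.11.1}$$

$\mathbf n$ being the unit vector along the node line.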
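In fact, the velocity of the i-th point can simply be written as

$$\dot{\mathbf x}_i = \omega\wedge(P_i - O) \tag{4.11.2}$$

(see footnote 10, p.202, last formula). Therefore

$$T = \frac12\sum_{i=1}^N m_i\,\big(\omega\wedge(P_i - O)\big)^2 = \frac12\,\omega\cdot I\omega \tag{4.11.3}$$

by vector calculus, see Eq. (3.9.15), where

$$I\omega = \sum_{i=1}^N m_i\,(P_i - O)\wedge\big(\omega\wedge(P_i - O)\big) = \mathbf K_O. \tag{4.11.4}$$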
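The components of $I\omega$ in the co-moving frame $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ have, by Eq.
(4.11.4), the form

$$(I\omega)_\alpha = \sum_{\beta=1}^3 I_{\alpha\beta}\,\omega_\beta,\qquad \alpha = 1,2,3 \tag{4.11.5}$$

and from Eq. (4.11.4) it is easy to check that

$$I_{\alpha\beta} = \sum_{i=1}^N m_i\,\big[(P_i - O)^2\,\delta_{\alpha\beta} - (P_i - O)_\alpha (P_i - O)_\beta\big], \tag{4.11.6}$$

for instance by using the identity $\mathbf a\wedge(\mathbf b\wedge\mathbf c) = (\mathbf a\cdot\mathbf c)\,\mathbf b - (\mathbf a\cdot\mathbf b)\,\mathbf c$. Since
the components of $(P_i - O)$ in $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ are constants, by the rigidity
constraint, the nine numbers of Eq. (4.11.5), actually six since $I_{\alpha\beta} = I_{\beta\alpha}$, are
characteristic constants of the body associated with the frame $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$.
At this point, it is convenient to choose the co-moving frame so that the
matrix $I$ (“inertia matrix”) is as simple as possible.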
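
The definitions above can be illustrated with a short sketch that builds the inertia matrix of Eq. (4.11.6) for a few point masses and checks that $I\omega$ coincides with the angular momentum sum of Eq. (4.11.4) and that $\frac12\,\omega\cdot I\omega$ is the kinetic energy. The function names and the sample body are hypothetical; coordinates are taken in the co-moving frame.

```python
def cross(a, b):
    """Vector product a ^ b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_matrix(masses, points):
    """I_ab = sum_i m_i (|x_i|^2 delta_ab - x_ia x_ib), as in Eq. (4.11.6)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, points):
        r2 = sum(c * c for c in x)
        for a in range(3):
            for b in range(3):
                I[a][b] += m * ((r2 if a == b else 0.0) - x[a] * x[b])
    return I

def angular_momentum(masses, points, omega):
    """K_O = sum_i m_i x_i ^ (omega ^ x_i), as in Eq. (4.11.4)."""
    K = [0.0, 0.0, 0.0]
    for m, x in zip(masses, points):
        term = cross(x, cross(omega, x))
        for a in range(3):
            K[a] += m * term[a]
    return K

def kinetic_energy(masses, points, omega):
    """T = (1/2) sum_i m_i |omega ^ x_i|^2, using the velocities of Eq. (4.11.2)."""
    return 0.5 * sum(m * sum(c * c for c in cross(omega, x))
                     for m, x in zip(masses, points))

def mat_vec(M, v):
    """Matrix-vector product for a 3x3 matrix stored as nested lists."""
    return [sum(M[a][b] * v[b] for b in range(3)) for a in range(3)]
```

For a body made of masses 1, 2, 3 placed at (1,0,0), (0,1,0), (1,1,1), `mat_vec(inertia_matrix(...), omega)` and `angular_momentum(..., omega)` should agree to rounding error for any `omega`.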
Note that by rotating the $\mathbf i_1,\mathbf i_2,\mathbf i_3$ axes to $\mathbf i'_1,\mathbf i'_2,\mathbf i'_3$ the coordinates of the
vectors $(P_i - O)$ become $(P_i - O)'_\alpha$, $\alpha = 1,2,3$, in the new frame, related to
the old coordinates by

$$(P_i - O)_\alpha = \sum_{\beta=1}^3 R_{\alpha\beta}\,(P_i - O)'_\beta \tag{4.11.7}$$

and R is an orthogonal matrix, $RR^T = R^TR = 1$ ($R_{\alpha\beta} = \mathbf i_\alpha\cdot\mathbf i'_\beta$). And, vice
versa, any orthogonal matrix corresponds to some frame $(O;\mathbf i'_1,\mathbf i'_2,\mathbf i'_3)$ so that
Eq. (4.11.7) expresses the change of coordinates.

Therefore, the inertia matrix depends on the co-moving frame and in
$(O;\mathbf i'_1,\mathbf i'_2,\mathbf i'_3)$ it becomes $I'$ related to $I$ by

$$I = R\,I'R^T \tag{4.11.8}$$


by Eqs. (4.11.7) and (4.11.6), in matrix notations. Then we can choose R so
that $I'$ becomes

$$I' = \begin{pmatrix} I_1 & 0 & 0\\ 0 & I_2 & 0\\ 0 & 0 & I_3 \end{pmatrix} \tag{4.11.9}$$

where $0 < I_1 \le I_2 \le I_3$. Such an R exists because $I$ is a symmetric positive
definite matrix⁹ and every such matrix can be “diagonalized” by an orthogonal
transformation (see Appendix F).
Hence it is not restrictive to suppose, from the beginning, that the choice of
the comoving frame $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ is such that $I$ takes the form of Eq. (4.11.9).
With this choice of the co-moving axes, the kinetic energy and the angular
momentum become [see Eqs. (4.11.3), (4.11.4), and (4.11.5)]

$$T = \frac12\,(I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2), \tag{4.11.10}$$

$$\mathbf K_O = I_1\omega_1\,\mathbf i_1 + I_2\omega_2\,\mathbf i_2 + I_3\omega_3\,\mathbf i_3. \tag{4.11.11}$$


To write the Lagrangian function describing the motion of the body, with
O fixed and subject to no force other than that of the ideal constraints of fixed
O and of rigidity, it will be enough to express the kinetic energy in terms of
the Euler angles (Fig. 4.9) and of their time derivatives, through Eqs. (4.11.1)
and (4.11.10). The components of $\omega$ become explicitly

$$\omega_1 = \dot{\bar\theta}\cos\bar\psi + \dot{\bar\varphi}\sin\bar\theta\,\sin\bar\psi \tag{4.11.12}$$
$$\omega_2 = -\dot{\bar\theta}\sin\bar\psi + \dot{\bar\varphi}\sin\bar\theta\,\cos\bar\psi \tag{4.11.13}$$
$$\omega_3 = \dot{\bar\varphi}\cos\bar\theta + \dot{\bar\psi} \tag{4.11.14}$$

by Eqs. (3.9.3) and (4.11.1). The result is not particularly illuminating in
the general case and we write it only in the “gyroscope case” when, say,
$I_1 = I_2 \stackrel{def}{=} I$. One finds

$$\mathcal L = \frac12\,I\,\big(\dot{\bar\theta}^2 + \sin^2\bar\theta\;\dot{\bar\varphi}^2\big) + \frac12\,I_3\,\big(\dot{\bar\varphi}\cos\bar\theta + \dot{\bar\psi}\big)^2 \tag{4.11.15}$$


Before treating the general case, let us study the system described by Eq. 
(4.11.15), i.e., the gyroscope. In this case, the results are easier and particu- 
larly suggestive. 

As is often the case, it is not convenient to write down only the Lagrange 
equations for Eq. (4.11.15) and discuss them. It is better to combine them with 
other information which can be obtained by general conservation principles 
(of energy and angular momentum, in the present case). Such information, 
although implicitly present in the Lagrange equations, is not very obvious 
there. 

Since $\mathbf K_O$ is a constant of the motion, given a motion $t \to (\bar\theta(t), \bar\varphi(t), \bar\psi(t))$
with initial datum $(\bar\theta_0, \bar\varphi_0, \bar\psi_0, \dot{\bar\theta}_0, \dot{\bar\varphi}_0, \dot{\bar\psi}_0)$, we can suppose without affecting
generality that $\mathbf K_O$ is parallel to some fixed axis $\mathbf k$:

$$\mathbf K_O = A\,\mathbf k,\qquad A > 0 \tag{4.11.16}$$

(the $A = 0$ case corresponds to a motionless solid which remains such forever,
of course).

⁹ Since $\frac12\,\omega\cdot I\omega$ = (kinetic energy of the body) $\ge 0$, $\forall\,\omega\in R^3$, and it can vanish only if
$\omega = 0$ because the points are assumed to be not aligned, see Eq. (4.11.2).

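Let $(O;\mathbf i,\mathbf j,\mathbf k)$ be a reference frame with z-axis oriented as $\mathbf k$ and choose
$\mathbf i$ on the intersection between the $(\mathbf i,\mathbf j)$ plane and the $(\bar{\mathbf i},\bar{\mathbf j})$ plane. We suppose that
such planes do not coincide (otherwise, we change $(\bar{\mathbf i},\bar{\mathbf j})$).

The motion in this new fixed frame $(O;\mathbf i,\mathbf j,\mathbf k)$, whose definition however
depends on the initial data, will be discussed calling $(\theta, \varphi, \psi)$ the Euler angles
of $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ with respect to the frame $(O;\mathbf i,\mathbf j,\mathbf k)$.

The components of $\mathbf K_O = A\,\mathbf k$ in the co-moving frame are expressed, see
Eq. (3.9.3), in terms of the new Euler angles as

$$(\mathbf K_O)_3 = A\cos\theta,\qquad (\mathbf K_O)_1 = A\sin\theta\,\sin\psi,\qquad (\mathbf K_O)_2 = A\sin\theta\,\cos\psi. \tag{4.11.17}$$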
By relations like Eqs. (4.11.12)-(4.11.14), written with the new angles, the
angular momentum conservation gives the following relations:

$$A\cos\theta = I_3\omega_3 = I_3\,(\dot\varphi\cos\theta + \dot\psi), \tag{4.11.18}$$
$$A\sin\theta\,\cos\psi = I\,\omega_2, \tag{4.11.19}$$
$$A\sin\theta\,\sin\psi = I\,\omega_1, \tag{4.11.20}$$

which are three differential equations for the three unknowns $\theta, \varphi, \psi$, where A is
a constant [$\omega_1, \omega_2$ are also expressed in terms of the angles $\theta, \varphi, \psi$ and of their
derivatives by relations like Eqs. (4.11.12) and (4.11.13)].

Instead of discussing the above equations, which, in principle, should be
sufficient to determine the motion, we shall combine them with some of the
Lagrangian equations associated with Eq. (4.11.15), written in the new $\theta, \varphi, \psi$
variables (i.e., without the overbars). The analysis is based on [28].

Since Eq. (4.11.15) does not explicitly depend upon $\varphi, \psi$, one deduces
two conservation laws from Eq. (4.11.15) by writing the Lagrange equations
corresponding to the variables $\varphi, \psi$:

$$\frac{d}{dt}\,I_3\,(\dot\varphi\cos\theta + \dot\psi) = 0, \tag{4.11.21}$$

corresponding to $\psi$ and

$$\frac{d}{dt}\big(I\sin^2\theta\;\dot\varphi + I_3\,(\dot\varphi\cos\theta + \dot\psi)\cos\theta\big) = 0 \tag{4.11.22}$$

corresponding to $\varphi$.

Equations (4.11.18)-(4.11.22) form a redundant system: but they easily
determine the functions $(\theta(t), \varphi(t), \psi(t))$ in terms of the initial data.

In fact, Eq. (4.11.21) implies that $\dot\varphi\cos\theta + \dot\psi$ is constant as t varies; hence,
Eq. (4.11.18) implies that $\cos\theta$ is constant, i.e.,

$$\theta(t) = \theta_0 \tag{4.11.23}$$

(remark that this holds only in the reference frame $(O;\mathbf i,\mathbf j,\mathbf k)$ chosen after the
particular motion had been selected, which is very special: for instance $\dot\theta = 0$
and thus $\dot\theta_0 = 0$).

Using Eqs. (4.11.23) and (4.11.21) in Eq. (4.11.22), we see that $\ddot\varphi = 0$, i.e.,

$$\varphi(t) = \varphi_0 + \dot\varphi_0\,t. \tag{4.11.24}$$

Then the constancy of $\dot\varphi$ and of $\theta$ and Eq. (4.11.21) imply that $\dot\psi$ is also a
constant:

$$\psi(t) = \psi_0 + \dot\psi_0\,t. \tag{4.11.25}$$

Hence, Eqs. (4.11.23)-(4.11.25) provide a full description of the motion in the
chosen coordinates (which, we stress once more, refer to a reference frame depending
on the motion itself, having z axis parallel to the constant angular momentum).
It appears that the motion expressed in the Cartesian coordinates is
quasi-periodic with periods

$$T_1 = \frac{2\pi}{\dot\varphi_0},\qquad T_2 = \frac{2\pi}{\dot\psi_0}. \tag{4.11.26}$$

It follows that the motion is quasi-periodic but generally not periodic, although
the set of the initial data for which $\frac{\dot\varphi_0}{\dot\psi_0}$ is rational is a dense set of data
lying on periodic orbits.

By Observation (5) to Definition 10, p.288, the above system should be
integrable in the sense of Definition 10, p.287, on vast regions of the data
space.
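
A small sketch may help to see how the constants of the gyroscope motion are related. Setting $\dot\theta = 0$ in Eqs. (4.11.19), (4.11.20) gives $\dot\varphi_0 = A/I$ (for $\sin\theta_0 \ne 0$), and Eq. (4.11.18) then gives $\dot\psi_0 = A\cos\theta_0\,(1/I_3 - 1/I)$. These two formulas are derived here for illustration and are not stated in this form in the text. The code below, with hypothetical function names, computes the rates and the periods of Eq. (4.11.26) and lets one verify Eqs. (4.11.18)-(4.11.20) for any value of $\psi$.

```python
import math

def gyroscope_rates(A, theta0, I, I3):
    """Constant rates (phi_dot0, psi_dot0) of the symmetric body I1 = I2 = I."""
    phi_dot0 = A / I                                        # (4.11.19)-(4.11.20), theta_dot = 0
    psi_dot0 = A * math.cos(theta0) * (1.0 / I3 - 1.0 / I)  # then (4.11.18)
    return phi_dot0, psi_dot0

def omega_components(theta, psi, theta_dot, phi_dot, psi_dot):
    """Components of omega on the co-moving axes, as in (4.11.12)-(4.11.14)."""
    w1 = theta_dot * math.cos(psi) + phi_dot * math.sin(theta) * math.sin(psi)
    w2 = -theta_dot * math.sin(psi) + phi_dot * math.sin(theta) * math.cos(psi)
    w3 = phi_dot * math.cos(theta) + psi_dot
    return w1, w2, w3

def periods(phi_dot0, psi_dot0):
    """Periods of Eq. (4.11.26); a vanishing rate is reported as an infinite period."""
    T1 = 2.0 * math.pi / abs(phi_dot0) if phi_dot0 != 0 else math.inf
    T2 = 2.0 * math.pi / abs(psi_dot0) if psi_dot0 != 0 else math.inf
    return T1, T2
```

The motion is periodic exactly when the ratio of the two returned rates is rational, which in floating point can only be explored approximately.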

Let us study the general case, assuming $0 < I_1 < I_2 < I_3$ and using a
method, inspired by [28], quite different from the preceding one.

As before, given a motion, the angular momentum is a constant together
with the kinetic energy. This implies

$$I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2 = 2E = \text{const}, \tag{4.11.27}$$
$$I_1^2\omega_1^2 + I_2^2\omega_2^2 + I_3^2\omega_3^2 = A^2 = \text{const}, \tag{4.11.28}$$

giving two of the three components of $\omega$ in terms of the third:

$$\omega_1^2 = \frac{(2EI_3 - A^2) - (I_3 - I_2)\,I_2\,\omega_2^2}{I_1\,(I_3 - I_1)}, \tag{4.11.29}$$

$$\omega_3^2 = \frac{(A^2 - 2EI_1) - (I_2 - I_1)\,I_2\,\omega_2^2}{I_3\,(I_3 - I_1)}. \tag{4.11.30}$$

To find an equation allowing the determination of $\omega_2$ one can remark that
Eq. (4.11.28) contains less information than the constancy of the angular
momentum as a vector.

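In fact, the angular momentum conservation means [recalling Eq. (3.9.12)]

$$\begin{aligned}
0 = \frac{d\mathbf K_O}{dt} &= \frac{d}{dt}\,(I_1\omega_1\,\mathbf i_1 + I_2\omega_2\,\mathbf i_2 + I_3\omega_3\,\mathbf i_3)\\
&= I_1\dot\omega_1\,\mathbf i_1 + I_2\dot\omega_2\,\mathbf i_2 + I_3\dot\omega_3\,\mathbf i_3 + I_1\omega_1\frac{d\mathbf i_1}{dt} + I_2\omega_2\frac{d\mathbf i_2}{dt} + I_3\omega_3\frac{d\mathbf i_3}{dt}\\
&= I_1\dot\omega_1\,\mathbf i_1 + I_2\dot\omega_2\,\mathbf i_2 + I_3\dot\omega_3\,\mathbf i_3 + I_1\omega_1\,\omega\wedge\mathbf i_1 + I_2\omega_2\,\omega\wedge\mathbf i_2 + I_3\omega_3\,\omega\wedge\mathbf i_3
\end{aligned} \tag{4.11.31}$$

which, written in components on $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$, is

$$I_1\dot\omega_1 = (I_2 - I_3)\,\omega_2\omega_3, \tag{4.11.32}$$
$$I_2\dot\omega_2 = (I_3 - I_1)\,\omega_3\omega_1, \tag{4.11.33}$$
$$I_3\dot\omega_3 = (I_1 - I_2)\,\omega_1\omega_2. \tag{4.11.34}$$

These very beautiful equations are the “Euler equations” for the motion of
the solid. Equation (4.11.33) together with Eqs. (4.11.29) and (4.11.30) give
the equation for $\omega_2$:

$$\dot\omega_2 = \pm\frac{1}{I_2}\sqrt{\frac{\big[(2EI_3 - A^2) - (I_3 - I_2)I_2\omega_2^2\big]\,\big[(A^2 - 2EI_1) - (I_2 - I_1)I_2\omega_2^2\big]}{I_1 I_3}} \tag{4.11.35}$$

and the discussion of the choice of sign in Eq. (4.11.35) leads to the usual
result: initially, $\dot\omega_2$ has some sign which is kept until it vanishes, then the sign
changes until the next time $\dot\omega_2$ vanishes, etc., alternating¹⁰ (see §2.7).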

¹⁰ As in the one-dimensional conservative problems, if $\dot\omega_2$ vanishes initially the choice of
sign for $t > 0$ and small can be inferred from the initial value of $\ddot\omega_2$ (see §2.7).

Hence, recalling §2.7, Eq. (4.11.35) tells us that $\omega_2$ varies over time as the
abscissa of a point mass with mass 2, total energy 0, moving under the action
of a conservative force with potential energy:

$$V_{E,A}(x) = -\frac{\big[(2EI_3 - A^2) - (I_3 - I_2)I_2x^2\big]\,\big[(A^2 - 2EI_1) - (I_2 - I_1)I_2x^2\big]}{I_1 I_2^2 I_3}. \tag{4.11.36}$$

Therefore, $t \to \omega_2(t)$ is a $C^\infty$-periodic function of t oscillating between two
extreme values $a_+(E, A)$, $a_-(E, A)$ which are the extremes of the smaller
of the two intervals $(-a_1, a_1)$, $(-a_3, a_3)$ with $\pm a_j$ = roots of $V_{E,A}(x) = 0$,
$a_j > 0$, $j = 1, 3$:

$$a_1(E, A) = \sqrt{\frac{2EI_3 - A^2}{I_2\,(I_3 - I_2)}},\qquad a_3(E, A) = \sqrt{\frac{A^2 - 2EI_1}{I_2\,(I_2 - I_1)}}, \tag{4.11.37}$$

provided

$$a_1(E, A) \ne a_3(E, A); \tag{4.11.38}$$

otherwise, the equation $V_{E,A} = 0$ has only two solutions, $\pm a$, and $V'_{E,A}$ vanishes
there so that the motion, by the analysis of §2.7, will be aperiodic.
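
The statements above lend themselves to a direct numerical check. The sketch below integrates the Euler equations (4.11.32)-(4.11.34) with a classical fourth-order Runge-Kutta step (an arbitrary choice of integrator, not the method of the text), monitors the conserved quantities $2E$ and $A^2$ of Eqs. (4.11.27), (4.11.28), and records the largest $|\omega_2|$ reached, to be compared with $\min(a_1, a_3)$ from Eq. (4.11.37). All names and sample values are illustrative.

```python
import math

def euler_rhs(w, I):
    """Right-hand sides of the Euler equations (4.11.32)-(4.11.34)."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, I, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(x + c * h * kx for x, kx in zip(w, k))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(k1, 0.5), I)
    k3 = euler_rhs(shifted(k2, 0.5), I)
    k4 = euler_rhs(shifted(k3, 1.0), I)
    return tuple(x + h * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

def invariants(w, I):
    """Return (2E, A^2) as in Eqs. (4.11.27), (4.11.28)."""
    two_E = sum(Ij * wj * wj for Ij, wj in zip(I, w))
    A2 = sum((Ij * wj) ** 2 for Ij, wj in zip(I, w))
    return two_E, A2

def omega2_bounds(two_E, A2, I):
    """Return (a1, a3) of Eq. (4.11.37); requires I1 < I2 < I3."""
    I1, I2, I3 = I
    a1 = math.sqrt((two_E * I3 - A2) / (I2 * (I3 - I2)))
    a3 = math.sqrt((A2 - two_E * I1) / (I2 * (I2 - I1)))
    return a1, a3

def integrate(w, I, h, steps):
    """Advance omega and return (final omega, largest |omega_2| seen)."""
    peak = abs(w[1])
    for _ in range(steps):
        w = rk4_step(w, I, h)
        peak = max(peak, abs(w[1]))
    return w, peak
```

With $I = (1, 2, 3)$ and $\omega(0) = (1, 0.2, 0.5)$, for example, one has $a_3 < a_1$, so $\omega_2$ is expected to oscillate between $\pm a_3$ while $\omega_1$ keeps its sign.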
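The period of $t \to \omega_2(t)$ is

$$T_1(E, A) = 2\int_{a_-(E,A)}^{a_+(E,A)} \frac{dx}{\sqrt{-V_{E,A}(x)}} \tag{4.11.39}$$

and a better expression for $\omega_2(t)$ can be obtained by defining

$$t \to \Omega(t, E, A),\qquad t \in R, \tag{4.11.40}$$

to be the solution of $2\ddot\Omega = -\frac{\partial V_{E,A}}{\partial x}(\Omega)$, hence, of Eq. (4.11.35), with initial
datum

$$\Omega(0, E, A) = a_-(E, A),\qquad \dot\Omega(0, E, A) = 0. \tag{4.11.41}$$

Then

$$\omega_2(t) = \Omega\big(t + t_0(\omega_2(0), \dot\omega_2(0)), E, A\big), \tag{4.11.42}$$

where $t_0(\omega_2(0), \dot\omega_2(0))$ is the minimum time necessary in order that the solution
(4.11.40) “reaches” the datum $\omega_2(0), \dot\omega_2(0)$. Furthermore, for $0 \le t \le
\frac12 T_1(E, A)$, it is

$$t = \int_{a_-(E,A)}^{\Omega(t,E,A)} \frac{dx}{\sqrt{-V_{E,A}(x)}}. \tag{4.11.43}$$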

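To find the motion $t \to (\theta(t), \varphi(t), \psi(t))$, we have to go back to the equations
expressing the conservation of angular momentum and its identity with
$A\,\mathbf k$, assuming, again, to have chosen a reference frame $(O;\mathbf i,\mathbf j,\mathbf k)$ with $\mathbf k$ and
$\mathbf K_O$ parallel and $\mathbf i$ along the node line of the planes $(\mathbf i,\mathbf j)$ and $(\bar{\mathbf i},\bar{\mathbf j})$, see Eqs.
(4.11.18)-(4.11.20). Now

$$I_3\omega_3 = A\cos\theta,\qquad I_1\omega_1 = A\sin\theta\,\sin\psi,\qquad I_2\omega_2 = A\sin\theta\,\cos\psi \tag{4.11.44}$$

tell us that

$$\theta(t) = \arccos\frac{I_3\,\omega_3(t)}{A}, \tag{4.11.45}$$

$$\psi(t) = \arctan\frac{I_1\,\omega_1(t)}{I_2\,\omega_2(t)}, \tag{4.11.46}$$

where the determination of the arc-tangent has to be chosen so that $t \to \psi(t)$
is continuous.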

From Eqs. (4.11.12) and (4.11.13) written without overbars (i.e., for the
Euler angles of $(O;\mathbf i_1,\mathbf i_2,\mathbf i_3)$ with respect to $(O;\mathbf i,\mathbf j,\mathbf k)$), we deduce $\dot\varphi$:

$$\dot\varphi = \frac{\omega_1\sin\psi + \omega_2\cos\psi}{\sin\theta} = A\,\frac{I_1\omega_1^2 + I_2\omega_2^2}{I_1^2\omega_1^2 + I_2^2\omega_2^2}, \tag{4.11.47}$$

where the second equality follows from Eqs. (4.11.19), (4.11.20), and (4.11.18)
(recalling that we are supposing $\mathbf K_O$ parallel to $\mathbf k$). Let

$$\widetilde\Phi(t, E, A) = A\,\frac{I_1\,\Omega_1(t, E, A)^2 + I_2\,\Omega(t, E, A)^2}{I_1^2\,\Omega_1(t, E, A)^2 + I_2^2\,\Omega(t, E, A)^2}, \tag{4.11.48}$$

where $\Omega_1$ is connected with $\Omega$ as $\omega_1$ with $\omega_2$ in Eq. (4.11.29) (note that the
sign ambiguity has no relevance here). Then Eq. (4.11.47) becomes

$$\dot\varphi = \widetilde\Phi\big(t + t_0(\omega_2(0), \dot\omega_2(0)), E, A\big). \tag{4.11.49}$$

Using the periodicity with period Eq. (4.11.39) of $t \to \widetilde\Phi(t, E, A)$ and calling
$(\chi_n(E, A))_{n\in Z}$ the Fourier coefficients of this function, it is

$$\widetilde\Phi(t, E, A) = \sum_{n=-\infty}^{+\infty} \chi_n(E, A)\; e^{\frac{2\pi i\,n\,t}{T_1(E,A)}}, \tag{4.11.50}$$

and, by integrating Eq. (4.11.49),

$$\varphi(t) = \varphi_0 + \chi_0(E, A)\,t + S\big(t + t_0(\omega_2(0), \dot\omega_2(0)), E, A\big) - S\big(t_0(\omega_2(0), \dot\omega_2(0)), E, A\big), \tag{4.11.51}$$

where

$$S(t, E, A) = \sum_{\substack{n=-\infty\\ n\ne0}}^{+\infty} \chi_n(E, A)\;\frac{e^{\frac{2\pi i\,n\,t}{T_1(E,A)}}}{\frac{2\pi i\,n}{T_1(E,A)}}, \tag{4.11.52}$$

which is a $C^\infty$-function periodic in t with period $T_1(E, A)$.
Equations (4.11.45), (4.11.40), (4.11.42), (4.11.51), (4.11.46), (4.11.29),
and (4.11.30) give a complete description of the motion under investigation.
The analogy of the above results with those of the two-body problem leads
to the formulation of the following proposition.


24 Proposition. The motion of a solid with a fixed point and inertia moments
$0 < I_1 < I_2 < I_3$ is integrable in the sense of Definition 10, §4.8, p.287, in
a family of regions covering the region W of the data space where $A \ne 0$,
$a_3(E, A) \ne a_1(E, A)$ [see Eq. (4.11.38)], and in such cases the motion is
quasi-periodic with two periods:

$$T_1(E, A) = 2\int_{a_-(E,A)}^{a_+(E,A)} \frac{dx}{\sqrt{-V_{E,A}(x)}}, \tag{4.11.53}$$

$$\frac{2\pi}{T_2(E, A)} = \frac{2A}{T_1(E, A)}\int_{a_-(E,A)}^{a_+(E,A)} \frac{dx}{\sqrt{-V_{E,A}(x)}}\;\frac{(2EI_3 - A^2) + I_2\,(I_2 - I_1)\,x^2}{I_1\,(2EI_3 - A^2) + I_2I_3\,(I_2 - I_1)\,x^2}, \tag{4.11.54}$$

and $a_\pm(E, A)$ are the two roots of the smallest modulus of $V_{E,A}(x) = 0$,
with $V_{E,A}$ being defined in Eq. (4.11.36).
Similar (and simpler) results hold if $I_1 = I_2 \ne I_3$, $I_1 \ne I_2 = I_3$, $I_1 = I_2 = I_3$.
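
The period $T_1(E, A)$ is easy to evaluate numerically. The sketch below, under the assumption that $V_{E,A}$ has the form of Eq. (4.11.36), removes the endpoint singularities with the substitution $x = a\sin u$, $a = \min(a_1, a_3)$, and applies the midpoint rule. As an independent cross-check it also evaluates the same integral through the arithmetic-geometric mean, using the standard identity $K(k) = \pi/(2\,\mathrm{agm}(1, \sqrt{1-k^2}))$ for the complete elliptic integral; that identity is brought in here only for verification and is not used in the text. Function names and sample values are hypothetical.

```python
import math

def potential(x, E, A, I):
    """V_{E,A}(x) in the form of Eq. (4.11.36)."""
    I1, I2, I3 = I
    f1 = (2.0 * E * I3 - A * A) - (I3 - I2) * I2 * x * x
    f3 = (A * A - 2.0 * E * I1) - (I2 - I1) * I2 * x * x
    return -f1 * f3 / (I1 * I2 * I2 * I3)

def turning_points(E, A, I):
    """Return (a1, a3) of Eq. (4.11.37)."""
    I1, I2, I3 = I
    a1 = math.sqrt((2.0 * E * I3 - A * A) / (I2 * (I3 - I2)))
    a3 = math.sqrt((A * A - 2.0 * E * I1) / (I2 * (I2 - I1)))
    return a1, a3

def period_T1(E, A, I, n=2000):
    """T1 = 2 * integral over (-a, a) of dx / sqrt(-V), with x = a sin(u)."""
    a = min(turning_points(E, A, I))
    h = math.pi / n
    total = 0.0
    for k in range(n):
        u = -0.5 * math.pi + (k + 0.5) * h
        x = a * math.sin(u)
        total += a * math.cos(u) / math.sqrt(-potential(x, E, A, I))
    return 2.0 * total * h

def agm(x, y):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(x - y) > 1e-15 * x:
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return x

def period_T1_closed_form(E, A, I):
    """Cross-check: T1 = 4 K(a/b) / (sqrt(c) b), with -V = c (a^2 - x^2)(b^2 - x^2)."""
    I1, I2, I3 = I
    a1, a3 = turning_points(E, A, I)
    a, b = min(a1, a3), max(a1, a3)
    c = (I3 - I2) * (I2 - I1) / (I1 * I3)
    K = math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - (a / b) ** 2)))
    return 4.0 * K / (math.sqrt(c) * b)
```

Both routines require $a_1 \ne a_3$, in agreement with the condition (4.11.38): when the two roots coincide the integral diverges and the motion is aperiodic.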


Observations. 

(1) The proof of Proposition 24 is essentially a different way of stating what 
has already been discussed above. The analysis of this section (as well as that 
on the central forces) is a classical proof. Somehow, it seems unsatisfactory 
because it looks like “magic”, with its use of redundant equations chosen, 
without apparent a priori logic, to reach the goal of finding explicit expres- 
sions for the motions. However, with further thought, it appears quite simple 
and, in particular, no need of the theory of elliptic functions emerges (a claim 
referring to deeper analysis of the properties of the quadratures discussed 
above). 

(2) However, there is a deeper critique of the above deductions. It is not at 
all clear that the systems are canonically integrable in the sense of Defini- 
tion 11, §4.8, p.289. This becomes very serious when one tries to study by 
the Hamilton-Jacobi theory the perturbations provoked by small conserva- 
tive forces on the above simple motions. The reader will realize this problem 
more clearly in §5.10-§5.12, where the theory of the Hamiltonian perturbations
based on the Hamilton-Jacobi equations is developed.

In the problems to §4.9 and §4.10 we have shown, however, how to deduce
for the central motions complete canonical integrability from the integrabil- 
ity proof. Likewise, in the problems of this section, we show how to deduce 
canonical integrability of the solid motion from parts of the proof of the above 
proposition. The derivation is simple and nice, not so much because it leads 
very quickly to the quadrature formulae (4.11.42), (4.11.47), (4.11.46), and 
(4.11.52), but mainly because it achieves the proof of canonical integrability 
at the same time. This integrability property had always been discussed ei- 
ther abstractly or quite obscurely until recently when the “Deprit canonical 
transformation” was introduced. 


PROOF. We discuss the proof in some further details because it is useful to 
illustrate Observation (5) to Definition 10, p.288. 

Let (O;i,j, k) be the fixed frame and let (O;i,j,k) be the “adapted” fixed 
frame chosen, once a particular motion is given, with the k axis parallel to 
the angular momentum. Suppose that i is parallel to the node of the planes 
(i,j) and (i,j) (i.e., to their intersection). 

To determine the initial datum in the J4 = I, case, we use the following 
coordinates: 

(1) the angle y between i and i; 

(2) the angle ô between Ko and k; 
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(3) the Euler angles y, Y of (O;i1,i2,i3) in (O;1,j, k); 

(4) the angular velocity variables ¢ and i). 

From the preceding analysis, it follows that the motion of the system has 
three prime integrals (ô, h, w) and, given them, it is described by the points 
(7.¥,v) € T3, and the time evolution on T? is described by quasi-periodic 
flow with pulsations 

o(d, $, Y) = 0, 02(4, p, Y) = $, aos(ô, p, Y) =v, (4.11.55) 
having denoted them with ø instead of w to avoid confusion with the above 
angular velocity components. 

The integrating map is thus I(0, Y, Y, 0, Y, 7) (6, p, 1), 7, p, Y). It should 
still be checked that this map is C'° nonsingular and invertible on a suitable 
family of neighborhoods W” which, as one uses the arbitrariness of the choice 
of (O;i,j,k), cover W. We do not enter into this analysis. 

In the general case (I, < Iz < I3), we replace the variables (6, ¢, 1b) which, 
with the exception of ô, are no longer conserved with the variables (6, Æ, A), 
and we also replace the angles which no longer rotate uniformly with the 
exception of y which is constant, with where (y, Ø, Y) where 


D= gitl) a0), P= p- Sltolwa(0), (0), E, A) 
(4.11.56) 
[see Eqs. (4.11.51), (4.11.52), and (4.11.42)]. 
By the discussion preceding Proposition 24, it appears that y, p, y are 
angles rotating with pulsations 


o1(A,E,5) =0, 02(A,E,5)= nem 93(A, E, ô) = mam (411.57) 


This follows after some contemplation of Eqs. (4.11.42) and (4.11.51). 

Again we do not enter into the analysis of the regularity and invertibility 
of the integration map 1(0, py, 0, D, 7) (6, E, A, 7, 9, p). 

Note that the coordinates chosen in the general case do not reduce to those
of the symmetric case ($I_1 = I_2$) when $I_2 \to I_1$.

However, there is great arbitrariness in defining the prime integrals because
any function of $\delta, E, A$ is still a prime integral, and it is possible to find two
other prime integrals $\Phi, \Psi$ becoming $\dot\varphi$ and $\dot\psi$ in the $I_1 = I_2$ case. In fact, let

$$\Phi = \frac{1}{T_1(E,A)}\int_0^{T_1(E,A)} A\,\frac{I_1\Omega_1(t)^2 + I_2\Omega_2(t)^2}{I_1^2\Omega_1(t)^2 + I_2^2\Omega_2(t)^2}\,dt, \tag{4.11.58}$$

where $\Omega_2(t)$ is defined in Eq. (4.11.30) and $\Omega_1(t)$ is related to $\Omega_2$ by Eq.
(4.11.29) with $\Omega_2(t)$ replacing $\omega_2(t)$ [and, likewise, we could define $\Omega_3(t)$ by
Eq. (4.11.30)].
Note that $\Phi$ is the average value of $\dot\varphi$ along a period [since $\dot\varphi$ is periodic
with period $T_1(E, A)$, see Eq. (4.11.47)].

Analogously, from Eqs. (4.11.12), (4.11.13) written without overbars, one
can find an expression of $\dot\psi$ in terms of $\omega_1, \omega_2, \omega_3$:

$$\dot\psi = \frac{(A^2 - 2EI_3)\,\omega_3}{I_1^2\omega_1^2 + I_2^2\omega_2^2} \tag{4.11.59}$$


So $\dot\psi$ is a periodic function with period $T_1(E, A)$ and we can define the prime
integral

$$\Psi = \frac{1}{T_1(E,A)}\int_0^{T_1(E,A)} \frac{(A^2 - 2EI_3)\,\Omega_3(t)}{I_1^2\Omega_1(t)^2 + I_2^2\Omega_2(t)^2}\,dt$$

where the ambiguity of the sign in the definition of $\Omega_3$ has, now, to be resolved
by remarking that $\omega_3$ from Eq. (4.11.30) never vanishes if $A \ne 0$ and, therefore,
it has a constant sign which we attribute also to $\Omega_3$.

It could also be possible to change w to a variable reducing to w, when 
I> — Iı. However, we shall not do this. 

It remains to check Eq. (4.11.54); Tz = =: 


(4.11.60) 


9 17; (B,A) 
E, A) = = P(t, E, A)dt 4.11.61 
xo( ’ ) nen | (t, ’ ) ( 6 ) 
and changing variable t + (t, E, A), one has [see Eq. (4.11.43)] 
Fee eee (4.11.62) 
—Vg a(2) 


Hence, recalling that $\Omega_1$ can be expressed in terms of $\Omega_2$, we can express the
integral on the right-hand side of Eq. (4.11.61) as an integral over the variable
$\Omega_2$, via Eqs. (4.11.48) and (4.11.29) and, after some algebra, Eq. (4.11.54)
follows.


4.11.1 Problems and Complements 


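1. Let $\bar{\mathcal L}$ be the Lagrangian function describing the motion of a rigid body in a fixed frame
$(O;\bar{\mathbf i},\bar{\mathbf j},\bar{\mathbf k})$ in Euler angle coordinates

$$\bar{\mathcal L} = \tfrac12 I_1(\dot{\bar\theta}\cos\bar\psi + \dot{\bar\varphi}\sin\bar\theta\sin\bar\psi)^2 + \tfrac12 I_2(-\dot{\bar\theta}\sin\bar\psi + \dot{\bar\varphi}\sin\bar\theta\cos\bar\psi)^2 + \tfrac12 I_3(\dot{\bar\varphi}\cos\bar\theta + \dot{\bar\psi})^2.$$

Compute the canonical variables $p_{\bar\theta}, p_{\bar\varphi}, p_{\bar\psi}$ associated with $\bar\theta, \bar\varphi, \bar\psi$ via $\bar{\mathcal L}$. Show that if $\mathbf K_O$
is the angular momentum of the solid with respect to the fixed point $O$, and if $\bar{\mathbf n}$ is the node
line unit vector, then

$$p_{\bar\theta} = \mathbf K_O\cdot\bar{\mathbf n},\qquad p_{\bar\varphi} = \mathbf K_O\cdot\bar{\mathbf k},\qquad p_{\bar\psi} = \mathbf K_O\cdot\mathbf i_3.$$
(Hint: Just apply the definition of p [Eq. (3.11.1)] and use Eqs. (4.11.11) and (3.9.3).)

2. Call $A = |\mathbf K_O|$, $K_z = p_{\bar\varphi}$, $L = \mathbf K_O\cdot\mathbf i_3 = p_{\bar\psi}$ and let $(\gamma, \varphi, \psi)$ be the angles considered on
p.318 (see Fig. 4.10). $(K_z, A, L, \gamma, \varphi, \psi)$ are the “Deprit variables” ([13]).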


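Figure 4.10: The Deprit angles. Here $\bar{\mathbf n}$ is the node line $(\bar{\mathbf i},\bar{\mathbf j}) \cap (\mathbf i_1, \mathbf i_2)$, $\mathbf n$ is the node line
$(\mathbf i_1, \mathbf i_2) \cap (\mathbf i,\mathbf j)$ and $\mathbf m = \mathbf i$ is the node $(\bar{\mathbf i},\bar{\mathbf j}) \cap (\mathbf i,\mathbf j)$. The $\mathbf j$ axis is not drawn.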


Show that given $(p_{\bar\theta}, p_{\bar\varphi}, p_{\bar\psi}, \bar\theta, \bar\varphi, \bar\psi)$, the Deprit variables are determined and vice versa.
(Hint: Note that pp = Acosé = Kz , Py Acos 6 = L, pg Asin6@ sin(w — 7%), and note 
that the angles ọ,0, Y — Y, p — y, ô, 0 can be arranged in a spherical triangle (Fig. 4.11). 


Figure 4.11: The spherical triangle associated with the Deprit angles.


Therefore, given the Deprit variables, one computes $p_{\bar\varphi} = K_z$, then $\cos\delta = K_z/A$, then
$p_{\bar\psi} = L$, then $\cos\theta = L/A$. Hence, at this point, one knows the elements $\varphi, \theta, \delta$ of the spherical
triangle in Fig. 4.11 and by solving it one computes, by spherical trigonometry, the three
other elements, i.e., $\psi - \bar\psi$, $\bar\varphi - \gamma$, $\bar\theta$, and since $\gamma, \psi$ are known, one gets $\bar\varphi, \bar\psi$.)


3. Consider the spherical triangle of Fig. 4.12. Check the basic spherical trigonometry 
relations: 


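1) $\cos C = \cos A\cos B + \sin A\sin B\cos\gamma$
2) $\cos\gamma = -\cos\alpha\cos\beta + \sin\alpha\sin\beta\cos C$
3) $\dfrac{\sin\alpha}{\sin A} = \dfrac{\sin\beta}{\sin B} = \dfrac{\sin\gamma}{\sin C}$
4) $\sin C\cos\beta = \cos B\sin A - \sin B\cos A\cos\gamma$
5) $\cos A\cos\gamma = \sin A\cot B - \sin\gamma\cot\beta$
6) $dA = \cos\beta\,dC + \cos\gamma\,dB + \sin B\sin\gamma\,d\alpha$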


(Hint: Draw the spherical triangle in Fig. 4.12 by locating the vertex 2 with the angle $\gamma$
on the z axis, the vertex 1 with the $\beta$ angle on the xz plane: so that the three vertices
are expressed in Cartesian coordinates as $\mathbf r_1 = (\sin A, 0, \cos A)$, $\mathbf r_2 = (0, 0, 1)$ and $\mathbf r_3 =
(\sin B\cos\gamma, \sin B\sin\gamma, \cos B)$. Then

to check (1) note that $\mathbf r_1\cdot\mathbf r_3 = \cos C$;

to check (2) apply (1) to the spherical triangle formed on the sphere by the perpendiculars
to the planes containing the arcs A, B, C;

to check (3) note that $\mathbf r_1\cdot\mathbf r_2\wedge\mathbf r_3 = \sin A\sin B\sin\gamma$ has to be symmetric in the interchange
of the role of $(A, \alpha)$, $(B, \beta)$, $(C, \gamma)$;

to check (4) remark that $\mathbf r_1\wedge\mathbf r_3\cdot\mathbf j = -\sin C\cos\beta$;

the identity (5) is a consequence of (1) and (4).)


Figure 4.12: Spherical triangle with the sides formed by the arcs A, B, C opposite to the
angles $\alpha, \beta, \gamma$.
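The relations of Problem 3 can also be checked numerically before proving them. The following is only a sketch under the construction suggested in the hint (vertex 2 on the z axis, vertex 1 in the xz plane); the helper names `vertex_angle` and `residuals` are illustrative and do not come from the text. It builds the triangle from the sides A, B and the included angle γ, measures C, α, β geometrically, and returns the differences between the two sides of relations 1) to 5) (relation 3 is split into two equalities).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def vertex_angle(p, q, r):
    """Angle at vertex p between the great-circle arcs p->q and p->r."""
    tq = unit(tuple(qc - dot(p, q) * pc for pc, qc in zip(p, q)))
    tr = unit(tuple(rc - dot(p, r) * pc for pc, rc in zip(p, r)))
    return math.acos(max(-1.0, min(1.0, dot(tq, tr))))

def residuals(A, B, gamma):
    """Left side minus right side of relations 1)-5) for sides A, B and angle gamma."""
    s, c = math.sin, math.cos
    r1 = (s(A), 0.0, c(A))                          # vertex 1, carries the angle beta
    r2 = (0.0, 0.0, 1.0)                            # vertex 2, carries the angle gamma
    r3 = (s(B) * c(gamma), s(B) * s(gamma), c(B))   # vertex 3, carries the angle alpha
    C = math.acos(dot(r1, r3))                      # side opposite to gamma
    alpha = vertex_angle(r3, r1, r2)
    beta = vertex_angle(r1, r2, r3)
    return [
        c(C) - (c(A) * c(B) + s(A) * s(B) * c(gamma)),
        c(gamma) - (-c(alpha) * c(beta) + s(alpha) * s(beta) * c(C)),
        s(alpha) / s(A) - s(gamma) / s(C),
        s(beta) / s(B) - s(gamma) / s(C),
        s(C) * c(beta) - (c(B) * s(A) - s(B) * c(A) * c(gamma)),
        c(A) * c(gamma) - (s(A) / math.tan(B) - s(gamma) / math.tan(beta)),
    ]
```

For instance, `residuals(0.9, 1.3, 0.7)` should return six numbers at the level of rounding errors. A numerical check of this kind is of course not a proof; it only guards against sign mistakes when the relations are used in Problems 4 and 12.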


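4. Show that the map $(p_{\bar\theta}, p_{\bar\varphi}, p_{\bar\psi}, \bar\theta, \bar\varphi, \bar\psi) \to (K_z, A, L, \gamma, \varphi, \psi)$ has the property (“Deprit’s
theorem”):

$$K_z\,d\gamma + A\,d\varphi + L\,d\psi = p_{\bar\theta}\,d\bar\theta + p_{\bar\varphi}\,d\bar\varphi + p_{\bar\psi}\,d\bar\psi.$$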


(Hint: Using Problems 2 and 3, show that dy = cos 0d(% — Y) + cos ôd (F — y) — sin 0 sin 
w)d0; then substitute into the left-hand side using Kz po, L Pp and —Asin@sin(w 
Y) = pg) 


5. The map $(p_{\bar\theta}, p_{\bar\varphi}, p_{\bar\psi}, \bar\theta, \bar\varphi, \bar\psi) \to (K_z, A, L, \gamma, \varphi, \psi)$, defined in Problems 2 and 4, maps six
variables into six others without any reference to a rigid body. Interpret Problem 4 as saying
that this map is a completely canonical map homogeneous in the variables in the sense of
Proposition 22, §3.11, p.224. (Hint: Apply Proposition 22, §3.11.)


6. Compute the rigid body’s Hamiltonian H in Deprit variables, remarking that, by Problem
5, it must simply be the kinetic energy expressed in these variables (see the general
properties of the completely canonical transformations, §3.12), and show that

$$H(K_z, A, L, \gamma, \varphi, \psi) = \frac{L^2}{2I_3} + \frac12\Big(\frac{\sin^2\psi}{I_1} + \frac{\cos^2\psi}{I_2}\Big)(A^2 - L^2).$$

Deduce the Hamilton equations of the motion and check that they are identical to Eqs.
(4.11.47) and (4.11.59). Use this Hamiltonian formulation to rederive directly the
integrability of the motions of a solid with a fixed point. (Hint: Note that the kinetic energy can
be derived from $\mathbf K_O = (\sqrt{A^2 - L^2}\sin\psi, \sqrt{A^2 - L^2}\cos\psi, L)$. Write the equations of motion
and integrate by quadratures.)
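As an illustration of Problem 6, the Hamilton equations for the pair $(L, \psi)$, namely $\dot\psi = \partial H/\partial L$ and $\dot L = -\partial H/\partial\psi$ with $A$ a constant parameter, can be integrated numerically to see that $H$ stays constant along the motion. The sketch below is not part of the original text: the inertia moments, the initial data and the Runge-Kutta step are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

I1, I2, I3 = 1.0, 2.0, 3.0   # illustrative inertia moments (assumed values)

def hamiltonian(A, L, psi):
    w = math.sin(psi) ** 2 / I1 + math.cos(psi) ** 2 / I2
    return L * L / (2 * I3) + 0.5 * w * (A * A - L * L)

def kinetic_energy(A, L, psi):
    """(1/2) sum K_i^2 / I_i with K_O taken from the hint of Problem 6."""
    k = math.sqrt(A * A - L * L)
    K = (k * math.sin(psi), k * math.cos(psi), L)
    return 0.5 * (K[0] ** 2 / I1 + K[1] ** 2 / I2 + K[2] ** 2 / I3)

def rhs(A, L, psi):
    w = math.sin(psi) ** 2 / I1 + math.cos(psi) ** 2 / I2
    dpsi = L / I3 - L * w                                                   # dH/dL
    dL = -(A * A - L * L) * math.sin(psi) * math.cos(psi) * (1 / I1 - 1 / I2)  # -dH/dpsi
    return dL, dpsi

def integrate(A, L, psi, dt=1e-3, steps=10000):
    """Classical fourth-order Runge-Kutta integration of (L, psi)."""
    for _ in range(steps):
        k1 = rhs(A, L, psi)
        k2 = rhs(A, L + 0.5 * dt * k1[0], psi + 0.5 * dt * k1[1])
        k3 = rhs(A, L + 0.5 * dt * k2[0], psi + 0.5 * dt * k2[1])
        k4 = rhs(A, L + dt * k3[0], psi + dt * k3[1])
        L += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        psi += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return L, psi
```

Calling `integrate(1.0, 0.5, 0.3)` and comparing `hamiltonian` before and after gives a direct check that the system has one degree of freedom with $A$ as a parameter, which is the starting point of Problem 7.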


7. Using the Hamiltonian in Problem 6, show that the solid with a fixed point gives rise 
to canonically integrable motions (see Definition 11, §4.8, p.289). (Hint: Since the map 
(pg. Pp Pp OP, )— (Kz, A, L, 7, p,p) is completely canonical, it is enough to show that 
the Hamiltonian motions generated by the Hamiltonian in Problem 6 are canonically inte- 
grable. The H has a p dependence and, at the same time, it also involves A: but one just 
finds the canonical transformation (L, yw) (M, p) that integrates the 1-degree of freedom 
system in which A’ is considered a parameter and keeps track of the obvious implications 
on the other variables. The procedure is standard and it is discussed as an example. Define 
the canonical transformation with generating function 


(K, A’, M,y, pW) = Kiy+ A'e+ S(A’,M,y) 


with S chosen so that ® solves the Hamilton-Jacobi equation for H: 


(4 SC: (a woe) = ea! M) 

2 \ I3 Ig h Ow 2 qh In 

where the function e(A’, M) is naturally chosen so that the function S does generate a 
canonical transformation on the Hamiltonian H, regarded as a function of L,~ only (pa- 


rameterized by A’ = A), bringing it to action angle variables (M, u). By Problem 5, 83.11, 
this means that the function e(A’, M) has to be chosen so that 


Oe(A’,M) _ 

aM 

where w(A’, E) is the pulsation of the motion (of this one-dimensional system parameterized 

by A’) with energy E = e(A’, M). Since the equation of motion for ẹ in this auxiliary one- 
dimensional system is 


w(A’, E) 


oH, 
p = ape L,Y), 


the pulsation will be such that 


on = o dy o hee dw 
1 ` H á 
w(A’, E) o w(t) 0 -H (A’,L) 
where L has to be fixed so that H(A’, L,w) = E; i.e., L has to be taken as a function 
L(B, A’): 
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E- A'2/sore + i] 
L=L(E,A', y) = 2—2 B 


which permits us to compute w(A’, E). 
The function e(A’', M) can be computed in terms of its inverse m(A’, E) (such that 
m(A’,e(A’, M)) = M), since om )(A’, E) must be 


ðe, - 
(a4 MD) = sa 


So, for instance, e can be defined by inverting the relation: 


E dE! 
m(A', E) =f DAB)’ 
E0(A’) w( ; ) 


where Eo(A’) = minz y H(A’, L,w). 
Coming back to S, we see that 


ap 
S(A', M, Y) = f° IEA, M), A’ 9) dy! 
0 

is the explicit solution of the Hamilton-Jacobi equation (recall the expression of L). 

The above -generated canonical transformation leaves A, Kz,-y unchanged and changes 
L to M, ọ to some new y’ and wy to some new pu with 

Os Os 
ma DA A, M, ý SOS A, M, 
AET v) b= 5M, Y) 

and transforms the H into e(A, M). 

The above transformation is “globally” defined because one can show that

$$S(A', M, 2\pi) = 2\pi M.$$

In fact $\frac{\partial S}{\partial M}(A', M, 2\pi) = 2\pi$ (since this can be checked directly by differentiating the integral
giving $S$ and by comparing it to the integral for computing $\omega(A, E)$ explicitly and then using
$\frac{\partial e}{\partial M} = \omega$); so $S(A', M, 2\pi) = 2\pi M + g(A')$ for some function $g(A')$. But $M = 0$ means
$E = E_0(A')$; hence, $L = 0$; hence, $S = 0$; hence, $g(A') = 0$. This means that when $(\psi, \varphi)$
vary on $T^2$, $(\varphi', \mu)$ also vary on $T^2$.)


The following problems provide a simple example of how to use the canon- 
ical formalism for a concrete application. A more complete treatment of the 
problem will be presented in Ch. 5, as an application of perturbation theory. 


8. (Solar precession Hamiltonian) Imagine that the Earth $\mathcal E$ is an ideally rigid homogeneous
solid of rotation with equatorial radius $R$. Assume that the center $T$ revolves on a purely
Keplerian orbit $t \to \mathbf r_T(t)$ and, see Fig. 4.10, fix the frame $\bar{\mathbf i},\bar{\mathbf j},\bar{\mathbf k}$ to be with center $T$ and with
$\bar{\mathbf k}$ axis orthogonal to the plane of the Earth orbit, while the $\bar{\mathbf i}$ axis is at the equinox line at
a prefixed time (epoch). Show that the motion of the Earth is described in the coordinates
$(\bar\theta, \bar\varphi, \bar\psi)$ of problem (1) above, by the Lagrangian:

$$\mathcal L = \frac12 J(\dot{\bar\varphi}\cos\bar\theta + \dot{\bar\psi})^2 + \frac12 I(\dot{\bar\theta}^2 + \dot{\bar\varphi}^2\sin^2\bar\theta) + \int_{\mathcal E}\frac{k M_S M_T}{|\mathbf r_T + \mathbf x|}\,\frac{d\mathbf x}{|\mathcal E|}$$

with $J = I_3$, $I = I_1 = I_2$ being the Earth inertia moments, $M_T, M_S$ being the masses of the
Earth and of the Sun, $k$ being the gravitational constant and $|\mathcal E|$ being the Earth volume: in
the case of an ellipsoid with polar radius $(1-\eta)R$ it is $J = (2/5)R^2 M_T$, $I = J(1 - \eta + \eta^2/2)$.
(Hint: show that $J, I$ are the appropriate inertia moments and remark that in the given
geocentric frame of reference the axes have a fixed orientation; hence the inertia forces
(constant per unit mass and due only to the drag, as the Coriolis force vanishes) have
vanishing moment with respect to $T$, by symmetry. Hence in the chosen comoving frame
we have an ideal solid body subject to the gravitational attraction whose potential, in a
configuration respecting the constraint of rigidity, is precisely the above integral.)


9. Show that the integral in the Lagrangian of problem (8) can be written: 


3kMsMr f (rr:x)} 1 dx R 


-V =01(t)+ arx O X o 
MeS a fee e e e 


)") 


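where $C_1(t)$ is a suitable function of $t$. (Hint: By Taylor expansion: $|\mathbf r_T + \mathbf x|^{-1} = |\mathbf r_T|^{-1} -
|\mathbf r_T|^{-2}(\mathbf r_T\cdot\mathbf x)/|\mathbf r_T| + (3(\mathbf r_T\cdot\mathbf x)^2/|\mathbf r_T|^2 - \mathbf x^2)/2|\mathbf r_T|^3 + O((R/|\mathbf r_T|)^4)$; developing to fourth
order one sees that the third order also vanishes.)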


10. Show that V in problem (9) can be written: 


3kMs I-J 
Qrrl3 J 


3 kM, 
S mJ cos? a 
2 |rr|3 


V = C2(t) 4 J cos? a = C2 (t) 


where C2(t) is a suitable function, ag= angle between the symmetry axis 13 and the vector 
rr, and m = (J — I)/I. (Hint: just compute explicitly the integrals over x in problem (9).) 


11. Let $\bar{\mathbf i}$ be as in problem (8). Then the angle $\bar\varphi$ is called the precession angle since the
equinox fixing epoch. If the Earth longitude (i.e. the angle between the position $\mathbf r_T$ and $\bar{\mathbf i}$)
is $\lambda_T$, then the apparent longitude is $\lambda_T - \bar\varphi$. Show that:

$$\cos\alpha = -\sin\bar\theta\,\sin(\lambda_T - \bar\varphi)$$
(Hint: write $\mathbf i_3 = (\sin\bar\theta\sin\bar\varphi, -\sin\bar\theta\cos\bar\varphi, \cos\bar\theta)$ and $\mathbf r_T/|\mathbf r_T| = (\cos\lambda_T, \sin\lambda_T, 0)$, in the
geocentric frame, and compute the scalar product.)


12. Using Figs. 4.10, 4.11, 4.12 and the trigonometric relations for general spherical triangles
in problem (3), plus the second of the following other identities of spherical trigonometry:

$$\sin C\cos\beta = \cos B\sin A - \sin B\cos A\cos\gamma$$
$$\cos A\cos\gamma = \sin A\cot B - \sin\gamma\cot\beta$$

show that the inversion in (2) can be actually performed via the relations:

$$\cos\delta = \frac{K_z}{A},\qquad \cos\theta = \frac{L}{A}$$
$$\cot(\bar\varphi - \gamma) = (\cos\varphi\cos\delta + \sin\delta\cot\theta)/\sin\varphi$$
$$\cot(\psi - \bar\psi) = (\cos\varphi\cos\theta + \sin\theta\cot\delta)/\sin\varphi$$
$$\sin\bar\theta = \frac{\sin\theta\,\sin\varphi}{\sin(\bar\varphi - \gamma)}$$

(Remark: to check the two spherical identities in problem (3) and the two above simply
draw the spherical triangle putting the vertex 2 with the angle $\gamma$ on the z axis, the vertex
1 with the $\beta$ angle on the xz plane so that the three vertices are expressed in cartesian
coordinates as $\mathbf r_1 = (\sin A, 0, \cos A)$, $\mathbf r_2 = (0, 0, 1)$ and $\mathbf r_3 = (\sin B\cos\gamma, \sin B\sin\gamma, \cos B)$.
Then observe that $\mathbf r_1\cdot\mathbf r_3 = \cos C$, that $\mathbf r_1\cdot\mathbf r_2\wedge\mathbf r_3$ has to be symmetric in the interchange of
the role of $(A, \alpha)$, $(B, \beta)$, $(C, \gamma)$ and that $\mathbf r_1\wedge\mathbf r_3\cdot\mathbf j = -\sin C\cos\beta$; the three latter relations,
after computing the left hand sides in cartesian coordinates for $\mathbf r_i$ yield, respectively, the
two identities in problem (3) and the first of the above two; the last identity is a consequence
of the first and third.)


13. Using the coordinates (Kz, A, L, y, p,p) show that the Hamiltonian describing the 
above system for the theory of the solar precession is: 


TEA”. AP SL? 3kMs R 
P ac ee p J cos? O((—_)4 
z. 7t + rE cos” ag + (a ) 


and, setting $\eta_1 = (J - I)/J$, $\eta_2 = (J - I)/I$ and neglecting $O((R/|\mathbf r_T|)^4)$, it becomes:


J cos? ag 


2 AR (55) _. 3kMs 
T N2 J T GPE 


so that in the case of an ellipsoidal Earth: $\eta_1 = \eta - \eta^2/2$, $\eta_2 = \eta + \eta^2/2 + O(\eta^3)$. (Hint: By
problems (10),(11) the term V added to the Lagrangian depends only on the coordinates
$(\bar\theta, \bar\varphi, \bar\psi)$: hence the conjugate momenta $p_{\bar\theta}, p_{\bar\varphi}, p_{\bar\psi}$ are given by the same expression as when
V = 0, see problem (1). Therefore in the $(p_{\bar\theta}, p_{\bar\varphi}, p_{\bar\psi}, \bar\theta, \bar\varphi, \bar\psi)$ variables the hamiltonian is
simply the same hamiltonian with V = 0 plus V expressed in terms of the new variables and
of time. Finally the map $(p_{\bar\theta}, p_{\bar\varphi}, p_{\bar\psi}, \bar\theta, \bar\varphi, \bar\psi) \to (K_z, A, L, \gamma, \varphi, \psi)$ is a completely canonical
time independent map; hence the hamiltonian in the last variables is just the old one
evaluated in the new variables.)


14. Using problem (12) show that the Hamiltonian $H_p$ can be written:


- [sin(Ar — y) (cos ọsin 6 cos 6 + sind cos o) — cos(àr — y) sin @ sin el) = 


1 A? A? — L? (s 


A 


2 
- (= 5)? cost Ar = 7) singl?) 


15. Suppose that the eccentricity of the Earth orbit is neglected (i.e. that the orbit of the
Earth is taken circular with radius a equal to the major semiaxis of the ellipse); show that
the average over the angles y, y and over time t of Hp is: 


Hp = H n2 Pi | a3 JI 


2J 2J ae a vata" 


— £ A? — L? (35 i 
i A242 A2 A21 A2 


K2 L? 1 1 K2 L? 1 =) 


if a is the major semiaxis of the Earth orbit, with an error of order O(ne). The latter 
hamiltonian describes the motion over time scales large compared to the slowest
period in the non averaged Hamiltonian.


16. Suppose that $A = L$ (i.e. neglect the non alignment between the Earth axis and the
angular momentum), so that $\gamma = \bar\varphi$. Furthermore, assume that the hamiltonian $H_p$ can
be replaced by its average $\bar H_p$ for the purpose of evaluating the average motion over many periods of
revolution (see Ch. 5, §10-12 for a more rigorous treatment). Then show that the precession
angular velocity would be:


— 
oR © eoar BO 


In this approximation the angles $\delta, \bar\theta$ are constant and, having neglected $\theta$, $\delta$ has the
interpretation of inclination angle $i_0$ of the Earth axis. Using Kepler’s law: $kM_S/a^3 = \omega_T^2$,
if $\omega_T = 2\pi/T$ is the angular velocity of the mean anomaly and $T$ is the period of the
Earth revolution and if $\omega_D$ is the angular velocity of the daily rotation, show that the solar
precession rate is:

$$\dot{\bar\varphi}_S = -\frac32\,\eta_1\,\omega_T^2\,\frac{J K_z}{A^2} = -\frac32\,\eta_1\,\frac{\omega_T^2}{\omega_D}\cos i_0$$

the fact that it is negative is often referred to as a retrograde precession. Show also that the
period of precession is $T_p = -2\pi/\dot{\bar\varphi}_S = T\,(\omega_D T)/(3\pi\eta_1\cos i_0)$, or since $T = 1$ year and
$\eta = 0.00335281$, $T_p \simeq 7.94\cdot 10^4$ years. (Hint: $A = I_3\omega_D$, in the suggested approximation,
and $K_z = A\cos i_0$. Use then the connection (4.10.18), i.e. the third Kepler’s law, between
the Earth axis $a$, the period $T$ and the gravitational constant $kM_S$: the relation $A = I_3\omega_D$
is correct only to a first approximation, evaluate the exact value, still neglecting $\theta$, and
check that this is really negligible. Also the relation between the period and the gravitational
constant is correct if we neglect the ratio of the masses $M_T/M_S$: check that if we do not
want to neglect it the correction would be an extra factor $(1 + M_T/M_S)$.)


17. A rough analysis of the lunar precession can be made assuming that the Moon is on
the ecliptic and that its orbit is circular. Show that the solar precession analysis can then
be applied to the Moon influence and that the lunar precession would be, if $M_L, a_L$ denote
respectively the Moon mass and the radius of its orbit:


3kMr JKz 
=A = + O(me?) = 13 (—)3 
% E a 2a3 A? (me) p(T Ms 


so that the total luni-solar precession would be: 


=f 045 2 58 a 3 Mr s 
Ap = AS +0 w(t (S) mes 


Evaluate the total rate of lunisolar precession in the above approximation and show that
it gives $T_p \simeq 2.51\cdot 10^4$ years (get the data from appendix P), or a yearly precession of the
equinoxes of $\simeq 51.6''$ per year. So that only 1/3 of the luni-solar precession is due to the
Sun. Show that even assuming that Jupiter gravitated around the Earth on a circular orbit
its contribution to the precession would be much smaller (Hint: with obvious notations it
would be a fraction of the order of $(a/a_J)^3 M_J/M_S$, i.e. $O(10^{-5})$ of the solar precession).
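The orders of magnitude in Problems 16 and 17 can be reproduced with a few lines of arithmetic. The sketch below is only indicative: it takes $\eta_1 \simeq \eta$, uses the relation $T_p = T(\omega_D T)/(3\pi\eta\cos i_0)$ for the solar part, and multiplies the solar rate by $1 + (M_L/M_S)(a/a_L)^3$ for the lunisolar total, as follows from Kepler's law $kM_S/a^3 = \omega_T^2$. The obliquity, the number of rotations per year, the masses and the distances are standard values assumed here; they are not the data of the book's appendix, so the last digits differ slightly from the figures quoted above.

```python
import math

ETA = 0.00335281                    # flattening, as quoted in Problem 16
I0 = math.radians(23.44)            # inclination of the Earth axis (assumed value)
ROTATIONS_PER_YEAR = 366.25         # daily rotations in one year (assumed value)
ML_OVER_MS = 7.342e22 / 1.989e30    # Moon mass / Sun mass (assumed values)
A_OVER_AL = 1.496e8 / 3.844e5       # Earth-Sun / Earth-Moon distance (assumed values)

def solar_period_years():
    """T_p = T (omega_D T) / (3 pi eta cos i0), with T = 1 year."""
    omega_d_t = 2 * math.pi * ROTATIONS_PER_YEAR
    return omega_d_t / (3 * math.pi * ETA * math.cos(I0))

def lunar_to_solar_ratio():
    """Ratio of the lunar to the solar precession rate in the circular model."""
    return ML_OVER_MS * A_OVER_AL ** 3

def lunisolar_period_years():
    return solar_period_years() / (1 + lunar_to_solar_ratio())

def arcsec_per_year(period_years):
    return 360 * 3600 / period_years
```

With these inputs the solar period comes out close to $7.9\cdot 10^4$ years and the lunisolar one close to $2.5\cdot 10^4$ years, so that the Sun accounts for roughly one third of the total, in line with the statement of Problem 17.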


The observed value of the lunisolar precession is however $50.38''$ per year: the
discrepancy is due to the crudeness of the approximations in the model. A more accurate
calculation (Laplace) leads to a formula which was in fact used to determine $\eta$ from the
known precession rate, in terms of the masses of the Sun and of the Moon. Check that
corrections come from several sources:

(1) the eccentricity of the Moon orbit, the inclination of the Moon orbit and the eccentricity 
of the Earth orbit have been neglected. 

(2) the center of mass of the Earth-Moon rather than that of the Earth revolve about the 
Sun on a keplerian orbit. 


18. Correct the above theory for the eccentricity of the Earth. This means that, looking
for the motion on scales of time large with respect to $T$, i.e. the period of revolution, we
do not write $(a/|\mathbf r_T|)^3\cos^2\alpha$ using the approximations $\lambda_T = \omega_T t + \text{const}$ and $a/|\mathbf r_T| = 1$,
but we use the Kepler laws (see problem (13), p.304), i.e.:

$$|\mathbf r_T| = a(1 + e\cos\xi)$$
$$\xi = \lambda - e\sin\lambda + (e^2/2)\sin 2\lambda$$
$$\lambda_T = \lambda - 2e\sin\lambda + (5/4)e^2\sin 2\lambda$$
$$\lambda = \omega_T t + \text{const}$$

and then evaluate the average of $(a/|\mathbf r_T|)^3\cos^2\alpha$ over $\gamma$ and $\lambda$, still neglecting $\theta$, i.e.
$(1 - L^2/A^2)$. Show that the result is:


(i+ ecos £)? sin? (Ar — y=- 4e?) 
A? A? — L? 3 Ease 


H = — ———— = Igw2.(1 — 
I, + OTs tmz sw ( 


19. Correct the Moon contribution to take into account the eccentricity $e_L$ and the
inclination $i_L$ of the Moon orbit. Calling $\theta_L, \varphi_L, \psi_L$ the Euler angles of the frame with $\mathbf j_L$ axis
orthogonal to the Moon orbit, $\mathbf n_L$ the node of the Moon orbit with the ecliptic, x axis
$\boldsymbol\imath_L$ pointing to the actual position of the Moon, and if $\alpha_L$ is the angle between the
Moon-Earth axis $\mathbf r_L$ and the Earth axis, it is:


—cosaz, = cos wz sind sin(y, — P) + cos wz sin 0 cos(yr — P) + cos wy — sin Oz cos O sin yg 
and then check that the average $\langle (a_L/|\mathbf r_L|)^3\cos^2\alpha_L\rangle$, still neglecting $\theta$, is:


3 3 
(+ ale a sin? 6z) 


(Hint: Write the coordinates of rz and 23 in the fixed frame as: 
13 =(— sin 8 cos G, — sin 8 cos P, cos 0) 
rz /|rz| =(—sin wz cos 8z sin yr + cos Yr cos 9L, 
sin wr cos 67, cos pr + coswz sin pz, sin 8z sin wy) 
and then use that the motion of the Moon is Keplerian for wz and a uniform precession for 
PL). 
20. Put together the last problems to check that the lunisolar precession is given by:


3 Msja\3 3 3 19. 
wp = (a H 5T) H M, (E) (14 wale = sin?iz)) 


and check that the data in appendix Q give —wp = 51.51” per year. Further corrections 
can be found by avoiding to take averages over time scales of the order of the year (theory 


of nutation). 


4.12 Integrable Systems. Geodesic Motion on the 
Surface of an Ellipsoid and Other Systems 


In general, given a closed regular surface $\Sigma \subset R^d$, the “geodesic motions” are
the motions which a unit mass point can undergo on $\Sigma$, when it is ideally
bound to $\Sigma$ and subject to no other active forces.
The Lagrangian of such motions is

$$\mathcal L = \frac12\dot{\mathbf x}^2 \tag{4.12.1}$$

and energy conservation, §3.5, implies that

$$T = \frac12\dot{\mathbf x}^2 = \text{constant} \tag{4.12.2}$$

on the considered motions. For instance, the set of the motions with initial
speed of modulus 1 consists entirely of motions in which the speed has modulus
1 at all times.

The “geodesic flow” on $\Sigma$ is the flow on $S$ = {data space for the motions
on $\Sigma$} = {set of pairs $(\boldsymbol\eta, \mathbf x)$ with $\mathbf x \in \Sigma$ and $\boldsymbol\eta$ compatible with the constraint,
i.e., tangent to $\Sigma$}$^{11}$ which to every point $(\boldsymbol\eta, \mathbf x) \in S$ associates $S_t(\boldsymbol\eta, \mathbf x) \in S$,
the configuration into which the datum $(\boldsymbol\eta, \mathbf x)$ evolves in time $t$ under the only
influence of an ideal constraint to $\Sigma$.

Since $|\dot{\mathbf x}|$ = constant, there is an a priori bound on the distance that
a point can travel in a given time and, therefore, the geodesic flow is well
defined, $\forall t \in R$.

Speed conservation has an interesting consequence: the action of a geodesic
motion $t \to \mathbf x(t)$ computed between $t_1$ and $t_2$ can be expressed in terms of
the curvilinear abscissas $s_1, s_2$ on the trajectory on which $\mathbf x$ moves. If $V = |\dot{\mathbf x}|$,

$$A_{t_1,t_2} = \{\text{action of } \mathbf x \text{ between } t_1 \text{ and } t_2\} = \frac{V^2}{2}(t_2 - t_1) = \frac{V}{2}(s_2 - s_1) \tag{4.12.3}$$
By the least-action principle, Proposition 8, §3.5, p.163, we know that the
motion $t \to \mathbf x(t)$ makes the action locally minimal in sufficiently small time
intervals.

From this, it follows that the trajectory $\Gamma$, as a curve in $R^d$, makes the
distance between $\mathbf x(t_1)$ and $\mathbf x(t_2)$ measured along $\Sigma$ locally minimal if $t_2$ is
close enough to $t_1$ (“Maupertuis’ principle”, see problems to §3.11). In fact,
given $\mathbf x_1 = \mathbf x(t_1) \in \Gamma$, suppose that for $|t_2 - t_1| < \varepsilon$, the action $A_{t_1,t_2}(\mathbf y)$ is
minimal on $\mathbf x$ as $\mathbf y$ varies in $M_{t_1,t_2}(\mathbf x(t_1), \mathbf x(t_2); \Sigma)$ = {motions on $\Sigma$ defined
for $t \in [t_1,t_2]$ and leading from $\mathbf x(t_1)$ to $\mathbf x(t_2)$}. If there existed a curve $C_{12}$
connecting $\mathbf x(t_1)$ with $\mathbf x(t_2)$, lying on $\Sigma$ and shorter than $(s_2 - s_1)$ = {length
of the part of $\Gamma$ between $\mathbf x(t_1)$ and $\mathbf x(t_2)$}, then one could run it with uniform
speed starting from $\mathbf x(t_1)$ at time $t_1$ so as to reach $\mathbf x(t_2)$ at time $t_2$.

Such a motion $\mathbf x_{C_{12}} \in M_{t_1,t_2}(\mathbf x(t_1), \mathbf x(t_2); \Sigma)$ would have an action

$$A_{t_1,t_2}(\mathbf x_{C_{12}}) = \frac12\Big(\frac{|C_{12}|}{t_2 - t_1}\Big)^2(t_2 - t_1) = \frac12\frac{|C_{12}|^2}{t_2 - t_1} < \frac12\frac{(s_2 - s_1)^2}{t_2 - t_1} = \frac12 V^2(t_2 - t_1) = \frac{V}{2}(s_2 - s_1) \tag{4.12.4}$$

as $|C_{12}| < s_2 - s_1$ = {length of $\Gamma$}. This contradicts the minimality of $A_{t_1,t_2}$
on $\mathbf x$.

$^{11}$ In fancy language, call this the “tangent fiber bundle” to $\Sigma$.

The curves on a surface $\Sigma$ which make minimal the distance between the
points that they connect, provided such points are close enough, are called
“geodesics” on $\Sigma$, and this explains the name given to the motions with
Lagrangian (4.12.1) on $\Sigma$.

The simplest nontrivial example of a geodesic motion is the motion on the
surface of the sphere in $R^3$. The possible trajectories of this motion are great
circles. It is possible to interpret this statement in terms of the integrability
of the geodesic motion on the surface of the sphere in the sense of Definition
10, §4.8, p.287. In this case, the motions are all periodic (see Observation (5),
p.288).

A less simple example is the motion on the surface of the ellipsoid. We 
shall only treat the case of the ellipsoid of revolution. However, the motion 
on an arbitrary ellipsoid is also integrable (see problems at the end of this 
section for a glimpse of the theory). 

In the case of an ellipsoid of revolution, we choose as z-axis the symmetry
axis of the ellipsoid $\mathcal E$ and determine the position on $\mathcal E$ of a point through the
two coordinates $(\theta, \varphi)$ as

$$x = a\sin\theta\cos\varphi,\qquad y = a\sin\theta\sin\varphi,\qquad z = b\cos\theta, \tag{4.12.5}$$

where $a$ and $b$ are the principal semi-axes of the ellipsoid. The Lagrangian
(4.12.1) of the geodesic motion on $\mathcal E$ can be written by Eq. (4.1.5) as

$$\mathcal L(\dot\theta, \dot\varphi, \theta, \varphi) = \frac12\big[(b^2\sin^2\theta + a^2\cos^2\theta)\,\dot\theta^2 + a^2\dot\varphi^2\sin^2\theta\big]. \tag{4.12.6}$$


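So that the equations of motion are

$$\frac{d}{dt}\big(\dot\varphi\sin^2\theta\big) = 0, \tag{4.12.7}$$

$$\frac{d}{dt}\big[(b^2\sin^2\theta + a^2\cos^2\theta)\,\dot\theta\big] = \frac{\partial\mathcal L}{\partial\theta}(\dot\theta, \dot\varphi, \theta, \varphi) \tag{4.12.8}$$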


However, it is convenient to discuss only Eq. (4.12.7), combining it with the
energy conservation principle:

$$\frac12\big[(b^2\sin^2\theta + a^2\cos^2\theta)\,\dot\theta^2 + a^2\dot\varphi^2\sin^2\theta\big] = E \tag{4.12.9}$$
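The two conservation laws used here, energy and $\dot\varphi\sin^2\theta$, can be observed on a numerically integrated geodesic. The following sketch is an illustration and not part of the text: it writes the Lagrange equations of the Lagrangian $\frac12[(b^2\sin^2\theta + a^2\cos^2\theta)\dot\theta^2 + a^2\dot\varphi^2\sin^2\theta]$ in explicit form and advances them with a standard fourth-order Runge-Kutta step. The semi-axes, initial data, step size and function names are arbitrary assumptions.

```python
import math

def accel(a, b, th, thd, phd):
    """Second derivatives of theta and phi from the Lagrange equations."""
    s, c = math.sin(th), math.cos(th)
    g = b * b * s * s + a * a * c * c          # coefficient of thd^2 in the Lagrangian
    dg = 2 * (b * b - a * a) * s * c           # its derivative with respect to theta
    thdd = (a * a * phd * phd * s * c - 0.5 * dg * thd * thd) / g
    phdd = -2 * phd * thd * c / s              # from d/dt (phd sin^2 th) = 0
    return thdd, phdd

def rk4_step(a, b, y, dt):
    def f(state):
        th, ph, thd, phd = state
        thdd, phdd = accel(a, b, th, thd, phd)
        return (thd, phd, thdd, phdd)
    def shift(state, k, h):
        return tuple(si + h * ki for si, ki in zip(state, k))
    k1 = f(y)
    k2 = f(shift(y, k1, dt / 2))
    k3 = f(shift(y, k2, dt / 2))
    k4 = f(shift(y, k3, dt))
    return tuple(yi + dt * (p + 2 * q + 2 * r + w) / 6
                 for yi, p, q, r, w in zip(y, k1, k2, k3, k4))

def energy(a, b, y):
    th, _, thd, phd = y
    s, c = math.sin(th), math.cos(th)
    return 0.5 * ((b * b * s * s + a * a * c * c) * thd ** 2 + a * a * phd ** 2 * s * s)

def momentum(y):
    th, _, _, phd = y
    return phd * math.sin(th) ** 2

def run(a, b, y, dt=1e-3, steps=20000):
    for _ in range(steps):
        y = rk4_step(a, b, y, dt)
    return y
```

Starting for example from `y = (1.0, 0.0, 0.3, 0.8)` with `a = 1.0`, `b = 0.6`, the values of `energy` and `momentum` before and after `run` should agree up to the integration error, which is what allows one to discard Eq. (4.12.8) in favor of the energy relation.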


Equations (4.12.9) and (4.12.7), which we use to define the prime integral
$A = \dot\varphi\sin^2\theta$, yield

$$\dot\theta^2 = \frac{2E\sin^2\theta - a^2A^2}{\sin^2\theta\,(b^2\sin^2\theta + a^2\cos^2\theta)} \stackrel{def}{=} -V_{E,A}(\theta) \tag{4.12.10}$$

which, by the usual argument, implies that $t \to \theta(t)$ is periodic with period:

$$T_1(E, A) = 2\int_{\theta_-(E,A)}^{\theta_+(E,A)} \frac{d\theta}{\sqrt{-V_{E,A}(\theta)}} \tag{4.12.11}$$
where $\theta_-(E, A) < \theta_+(E, A)$ are the two solutions of $V_{E,A} = 0$ in $(0, \pi)$; they are
symmetric with respect to $\pi/2$ and verify $\sin\theta_\pm = a|A|/\sqrt{2E}$.
Furthermore, $\theta$ verifies the equation

$$\ddot\theta = -\frac12\frac{\partial V_{E,A}}{\partial\theta}(\theta) \tag{4.12.12}$$

and, therefore, it is a $C^\infty$ function of $t$ (see §2.7) and can be expressed in terms
of the solution $t \to R(t, E, A)$ of Eq. (4.12.12) with initial data $R(0, E, A) =
\theta_-(E, A)$, $\dot R(0, E, A) = 0$. Such a function is defined, recalling §2.7, by

$$t = \int_{\theta_-(E,A)}^{R(t,E,A)} \frac{d\theta}{\sqrt{-V_{E,A}(\theta)}} \tag{4.12.13}$$
for $0 \le t \le \frac{T_1(E,A)}{2}$; it is continued naturally for $\frac{T_1(E,A)}{2} \le t \le T_1(E, A)$.


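Furthermore, $\theta(t)$ is given by

$$\theta(t) = R(t + t_0(\theta_0, \dot\theta_0), E, A), \tag{4.12.14}$$
where $t_0(\theta_0, \dot\theta_0)$ = {first time when the motion $t \to R(t, E, A)$ reaches $(\theta_0, \dot\theta_0)$},
with $\theta_0 = \theta(0)$, $\dot\theta_0 = \dot\theta(0)$.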
As one sees, the analysis of this problem by “quadratures” is entirely
analogous to the ones seen in §4.9-§4.11. As on those occasions, the motion
$t \to \varphi(t)$ can be deduced from Eq. (4.12.7) by a quadrature,

$$\varphi(t) = \varphi_0 + \int_0^t \frac{A}{\sin^2\theta(\tau)}\,d\tau \tag{4.12.15}$$

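It can be treated, as already seen in §4.9-§4.11, by noting that

$$\frac{A}{\sin^2 R(t, E, A)} = \sum_{k=-\infty}^{+\infty} \chi_k(A, E)\,e^{\frac{2\pi i k}{T_1(E,A)}t} \tag{4.12.16}$$
by the periodicity of $R$ and by Fourier’s theorem, as $t \to \frac{A}{\sin^2 R(t,E,A)}$ is a
$T_1(E, A)$-periodic $C^\infty$-function with Fourier coefficients $(\chi_k(A, E))_{k\in Z}$. Setting

$$S(t, E, A) = \sum_{k \ne 0} \chi_k(A, E)\,\frac{e^{\frac{2\pi i k}{T_1(E,A)}t}}{2\pi i k/T_1(E,A)} \tag{4.12.17}$$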

the quadrature (4.12.15) yields

$$\varphi(t) = \varphi_0 + \chi_0(E, A)\,t + S(t + t_0(\theta_0, \dot\theta_0), E, A) - S(t_0(\theta_0, \dot\theta_0), E, A). \tag{4.12.18}$$

Hence, from Eqs. (4.12.14) and (4.12.18), we can conclude that all motions
are quasi-periodic with periods $T_1(E, A)$ given by Eq. (4.12.11) and

$$T_2(E, A) = \frac{2\pi}{\chi_0(A, E)} = \frac{2\pi\,T_1(E, A)}{2\int_{\theta_-(E,A)}^{\theta_+(E,A)} \frac{A\,d\theta}{\sin^2\theta\,\sqrt{-V_{E,A}(\theta)}}} \tag{4.12.19}$$
after changing variables $\theta = R(t, E, A)$ along the lines already seen in
Proposition 24, §4.11, p.314, and Proposition 22, §4.9, p.293.
It could be checked that as $E, A$ vary the two periods $T_1(E, A)$, $T_2(E, A)$
will generally have an irrational ratio.
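The two periods can be evaluated numerically from the quadratures (4.12.11) and (4.12.19). The sketch below is one possible way to do it, not the method of the text: it assumes $A > 0$ and removes the square-root singularities at the turning points through the substitution $\cos\theta = c_0\sin u$, with $c_0^2 = 1 - a^2A^2/2E$ and $u \in [-\pi/2, \pi/2]$, which turns both integrands into smooth functions of $u$; the integrals are then approximated by the midpoint rule. The function name `periods` and the number of nodes are illustrative choices.

```python
import math

def periods(a, b, E, A, n=2000):
    """Return (T1, T2) for the geodesic motion on the ellipsoid of revolution.

    Assumes A > 0 and 2E > a^2 A^2. With cos(theta) = c0 sin(u) one finds
    d(theta)/sqrt(-V) = sqrt(g) du / sqrt(2E), where
    g = b^2 sin^2(theta) + a^2 cos^2(theta).
    """
    c0sq = 1.0 - a * a * A * A / (2.0 * E)
    if A <= 0 or c0sq <= 0:
        raise ValueError("need A > 0 and 2E > a^2 A^2")
    h = math.pi / n
    time_integral = 0.0
    angle_integral = 0.0
    for j in range(n):
        u = -math.pi / 2 + (j + 0.5) * h
        cos2 = c0sq * math.sin(u) ** 2               # cos^2(theta)
        root_g = math.sqrt(b * b * (1 - cos2) + a * a * cos2)
        time_integral += root_g * h
        angle_integral += root_g / (1 - cos2) * h    # extra factor 1/sin^2(theta)
    T1 = 2 * time_integral / math.sqrt(2 * E)
    dphi = 2 * A * angle_integral / math.sqrt(2 * E)  # chi_0 * T1
    T2 = 2 * math.pi * T1 / dphi
    return T1, T2
```

On the sphere ($a = b$) the two periods coincide, as they must since all geodesics are closed great circles; for $a \ne b$ the ratio $T_1/T_2$ returned by `periods` varies with $E$ and $A$, which is the numerical counterpart of the statement that the ratio is generally irrational.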
The above analysis basically achieves the proof of the following proposition
(if one disregards the checks of regularity and invertibility in suitably
large regions $W'$ in the data space of the map $I(\dot\theta(0), \dot\varphi(0), \theta(0), \varphi(0)) =
(E, A, \alpha, \beta)$ with $\alpha = \frac{2\pi}{T_1(E,A)}\,t_0(\theta(0), \dot\theta(0))$, $\beta = \varphi(0) - S(t_0(\theta(0), \dot\theta(0)), E, A)$):


25 Proposition. The set $W$ of the data for the geodesic motions on an ellipsoid
of revolution $\mathcal E$ and such that $E \ne 0$, $A \ne 0$ can be covered by sets $W' \subset W$
on which the motions are integrable in the sense of Definition 10, §4.8,
p.287. Such motions are quasi-periodic with periods $T_1(E, A)$, $T_2(E, A)$, given
by Eqs. (4.12.11) and (4.12.19). If the ellipsoid semi-axes are different the
motion is generally quasi-periodic and non-periodic.


Observations. 

(1) The discussion preceding Proposition 25 is very general and could be
repeated with essentially no change to cover very general classes of surfaces
of revolution like those parametrically described by equations like

$$z = f(\theta),\qquad x = g(\theta)\cos\varphi,\qquad y = g(\theta)\sin\varphi \tag{4.12.20}$$

for $(\theta, \varphi) \in [0, 2\pi] \times [0, 2\pi]$, with $f, g \in C^\infty(T^1)$ such that the curve in $R^2$
with parametric equations $\xi = g(\theta)$, $\eta = f(\theta)$, $\theta \in [0, 2\pi]$ is a simple closed
curve symmetric under reflection around the $\eta$ axis.

Other surfaces covered by the above method are those with parametric
equations

$$z = a(\psi),\qquad x = b(\psi)\cos\varphi,\qquad y = b(\psi)\sin\varphi \tag{4.12.21}$$

for $(\varphi, \psi) \in [0, 2\pi] \times [0, 2\pi]$, with $a, b \in C^\infty(T^1)$ such that the curve in $R^2$ with
parametric equations $\eta = a(\psi)$, $\xi = b(\psi)$, $\psi \in [0, 2\pi]$ is a simple closed curve
contained in the half-plane $\xi > 0$. The reader can check the above statements,
as an exercise on the quadrature method.

(2) Surfaces like Eq. (4.12.20) generalize the ellipsoid of revolution while those
like Eq. (4.12.21) generalize the “torus of revolution”: given $a, b > 0$, $a > b$,

$$x = (a + b\cos\psi)\cos\varphi,\qquad y = (a + b\cos\psi)\sin\varphi,\qquad z = b\sin\psi. \tag{4.12.22}$$


We conclude this list of remarkable integrable systems by citing a few other 
systems integrable on suitable regions W. 
(1) A point mass on an ellipsoid of revolution, with symmetry axis along the 
major axis of the revolving ellipse, subject to a force with potential energy 


g 
V(x) = — A, 4.12.23 
(x) EA ( ) 
where $\mathbf f_1, \mathbf f_2$ are the foci of the ellipse generating the ellipsoid.
This system can be integrated by the quadrature method of §4.9-§4.11, and
one obtains similar results in elliptic coordinates defined in terms of Cartesian
coordinates $(x, y, z)$ of $\mathbf x$ as

$$\sqrt{x^2 + y^2} = \sigma\sqrt{(\xi^2 - 1)(1 - \eta^2)},\qquad z = \sigma\xi\eta,\qquad \text{azimuth of } \mathbf x = \varphi \tag{4.12.24}$$

where $\xi \in [1, +\infty]$, $\eta \in [-1, 1]$, $\varphi \in [0, 2\pi]$ and the parameter $\sigma$ has to be
chosen so that the considered ellipsoid is a $\xi$ = constant surface. Such surfaces
are the ellipsoids

$$\frac{z^2}{\sigma^2\xi^2} + \frac{x^2 + y^2}{\sigma^2(\xi^2 - 1)} = 1. \tag{4.12.25}$$

(2) A unit mass on a sphere with potential energy in polar coordinates:

$$U(\theta, \varphi) = b(\theta) + \frac{c(\varphi)}{\sin^2\theta} \tag{4.12.26}$$

with $b, c$ $C^\infty$-periodic functions with period $2\pi$. This system is integrated by
quadratures by writing its Lagrangian function in polar coordinates and
discussing the Lagrange equations.

(3) A solid body with a symmetry axis fixed at a point $O$ of this axis, which
we call $\mathbf i_3$, different from the center of mass $G$ and subject to ideal constraints
plus the weight, i.e., a force $m_i\mathbf g$ on the $i$-th point (equivalent by Observation
(5) in §3.2, p.148, to a force $M\mathbf g$, $M = \sum_i m_i$, applied to $G$ as far as the force
momentum calculation is concerned).

This system (“heavy gyroscope”) is also integrable by quadratures: proceeding
as in §3.11, choose the fixed reference frame $(O;\mathbf i,\mathbf j,\mathbf k)$ with $\mathbf k$ axis
antiparallel to $\mathbf g$, and write the Lagrangian function in terms of the Euler angles.
The Lagrange equations can then be combined with the conservation laws,
for energy and for the $\mathbf k$ component of the angular momentum, to reduce
the problem to that of the analysis of one-dimensional systems, i.e., to the
quadratures. (See, also, problems at the end of this section and problems to
§3.5 in [28].)

(4) Two more difficult classical integrable systems are the geodesic motions
on the surface of a non symmetric ellipsoid (see problems at the end of this
section for an introduction to this theory) and the motions of a heavy rigid
body with a fixed point $O$, with the barycenter in the $i_1, i_2$ plane, say on the
$i_1$ axis at distance $a$ from $O$, and with inertia moments $I_1 = I_2 = 2I_3 \equiv 2I$.


Such systems can be shown to be integrable by quadratures (as discovered by 
Jacobi and Kovalevskaya, respectively, see problems). 
(5) Other systems are $N$ point masses on the line $R$ with Lagrangian functions

$L = \frac{m}{2}\sum_{i=1}^{N} \dot x_i^2 - g\sum_{i=1}^{N-1} e^{a(x_i - x_{i+1})}$ (4.12.27)

called the “Toda lattice”, or

$L = \frac{m}{2}\sum_{i=1}^{N} \dot x_i^2 - g\sum_{i<j} \frac{1}{(x_i - x_j)^2}$ (4.12.28)

called “Calogero lattice”, respectively. These were discovered very recently and
are also integrable. Some variants of such systems with the same properties
are also known.

(6) Obviously, there are other integrable systems: it suffices to perform an
arbitrary change of coordinates in the Lagrangian functions which we have
just examined to obtain Lagrangian functions of integrable systems.

However, only very “few” other systems are known that have the integrability 
property and that are “interesting”, i.e., not obtained by trivial changes of 
coordinates from those so far listed. Some can be found among the problems 
for §4.12. 
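The elliptic coordinates of item (1) can be checked numerically. The sketch below assumes they have the standard prolate form $\sqrt{x^2+y^2} = \sigma\sqrt{(\xi^2-1)(1-\eta^2)}$, $z = \sigma\xi\eta$, with the foci placed at $(0, 0, \pm\sigma)$; the function names are illustrative only. It verifies that a point with given $\xi$ lies on the ellipsoid $\xi$ = constant and that its distances from the two foci are $\sigma(\xi - \eta)$ and $\sigma(\xi + \eta)$.

```python
import math

def elliptic_to_cartesian(xi, eta, phi, sigma):
    """Cartesian (x, y, z) of the point with elliptic coordinates (xi, eta, phi).

    Assumes xi >= 1 and -1 <= eta <= 1 (prolate form, foci on the z axis).
    """
    r = sigma * math.sqrt((xi * xi - 1.0) * (1.0 - eta * eta))  # distance from the z axis
    return r * math.cos(phi), r * math.sin(phi), sigma * xi * eta

def ellipsoid_residual(x, y, z, xi, sigma):
    """Left-hand side of the ellipsoid equation minus 1 (zero on the surface xi = const)."""
    return z * z / (sigma * xi) ** 2 + (x * x + y * y) / (sigma ** 2 * (xi * xi - 1.0)) - 1.0

def focal_distances(x, y, z, sigma):
    """Distances from the two foci (0, 0, +sigma) and (0, 0, -sigma)."""
    r2 = x * x + y * y
    return math.sqrt(r2 + (z - sigma) ** 2), math.sqrt(r2 + (z + sigma) ** 2)

if __name__ == "__main__":
    xi, eta, phi, sigma = 1.5, 0.3, 0.7, 2.0
    x, y, z = elliptic_to_cartesian(xi, eta, phi, sigma)
    d1, d2 = focal_distances(x, y, z, sigma)
    print("ellipsoid residual:", ellipsoid_residual(x, y, z, xi, sigma))
    print("focal distances:", d1, d2)
    print("sigma*(xi - eta), sigma*(xi + eta):", sigma * (xi - eta), sigma * (xi + eta))
```

Since the focal distances depend only on $\xi$ and $\eta$, any potential built from them is independent of the azimuth, which is what makes the separation of variables plausible in these coordinates.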

Finally, we remark that all the integrable systems of §4.9-§4.12 could be shown
to be not only integrable in the sense of Definition 10, §4.8, p.287, but also
analytically and canonically integrable in the sense of Definition 11, §4.8,
p.289, in large regions of the phase space. In the problems of §4.10-§4.12, the
main steps towards such a proof are given.
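As a rough numerical illustration of item (5), the sketch below integrates a small Toda-type chain with a leapfrog scheme and monitors two of its prime integrals, the energy and the total momentum. It assumes a potential of the form $g\sum_i e^{a(x_i - x_{i+1})}$; the sign convention in the exponent, the integrator and all names are choices made here for illustration. Conservation of energy and momentum alone does not, of course, demonstrate integrability.

```python
import math

def toda_forces(x, g, a):
    """Forces -dV/dx_i for V = g * sum_i exp(a * (x_i - x_{i+1}))."""
    f = [0.0] * len(x)
    for i in range(len(x) - 1):
        e = g * a * math.exp(a * (x[i] - x[i + 1]))
        f[i] -= e        # dV/dx_i = +e
        f[i + 1] += e    # dV/dx_{i+1} = -e
    return f

def energy(x, v, m, g, a):
    """Total energy: kinetic part plus the exponential interaction."""
    kinetic = 0.5 * m * sum(vi * vi for vi in v)
    potential = g * sum(math.exp(a * (x[i] - x[i + 1])) for i in range(len(x) - 1))
    return kinetic + potential

def leapfrog(x, v, m, g, a, dt, steps):
    """Velocity-Verlet (leapfrog) integration; returns the final (x, v)."""
    x, v = list(x), list(v)
    f = toda_forces(x, g, a)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        f = toda_forces(x, g, a)
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]
    return x, v

if __name__ == "__main__":
    x0, v0 = [0.0, 1.0, 2.0], [0.5, 0.0, -0.5]
    e0 = energy(x0, v0, 1.0, 1.0, 1.0)
    x1, v1 = leapfrog(x0, v0, 1.0, 1.0, 1.0, 1e-3, 2000)
    print("energy drift:", energy(x1, v1, 1.0, 1.0, 1.0) - e0)
    print("momentum drift:", sum(v1) - sum(v0))
```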


4.12.1 Exercises and Problems 


1. Integrate explicitly by quadratures the systems mentioned in the points (1), (2), and 
(3) of the list of integrable systems in §4.12. By the Hamilton-Jacobi method, show their 
canonical integrability. 


2. Integrate the heavy gyroscope system (3) p.331, by using the Deprit variables (see
problems to §4.11). First show that the Hamiltonian (i.e., the energy) can be written in the
Deprit variables as $H(K_z, A, L, \gamma, \varphi, \psi)$ given by


4&4- L [K-L K2 L2 
2A "2p "|2 rr aaa teen 


where pp = Mgd, M = total mass, g = gravity constant, I = I} = Ig and Ig are the moments 
of inertia. Show also the canonical integrability of this system. 


3. Consider the “Kovalevskaya gyroscope”, see p.331, and show that its Lagrangian is

$L = I\dot\theta^2 + I\sin^2\theta\,\dot\varphi^2 + \frac{I}{2}(\dot\psi + \dot\varphi\cos\theta)^2 + Mga\sin\theta\cos\psi$

and explicitly write the Lagrange equations relative to the $\theta, \varphi, \psi$ variables.


4. In the context of Exercise 3, eliminate the w between the Lagrange equations relative to 
y and yw and add the resulting equation to the equation relative to the 0 variable multiplied 


by +i or —i, (i = V—1), successively. Show that the two resulting equations imply + ae = 
-44 =" same function”, where: 


U = (ġ sin@ +ib)? + Mgae™*” sind =V, 
so UV = |U|? = constant and UV is a prime integral. 12 


5. In the context of Problems 3 and 4, show that the Kovalevskaya gyroscope is integrable 
by quadratures on vast regions of phase space. 


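6. Consider the geodesic motion on the surface of the ellipsoid $\mathcal{E}$: $\frac{x^2}{a} + \frac{y^2}{b} + \frac{z^2}{c} = 1$,
$a < b < c$. Introduce the local coordinate system (“Jacobi’s system”) described by

$x = \sqrt{\frac{a(u-a)(v-a)}{(b-a)(c-a)}}, \quad y = \sqrt{\frac{b(u-b)(v-b)}{(a-b)(c-b)}}, \quad z = \sqrt{\frac{c(u-c)(v-c)}{(a-c)(b-c)}}$

for $(u, v) \in [b, c] \times [a, b]$ or $(u, v) \in [a, b] \times [b, c]$. Defining for $\lambda \in R$

$A(\lambda) = \frac{1}{4}\,\frac{\lambda}{(a-\lambda)(b-\lambda)(c-\lambda)}$

show that the kinetic energy is given by

$T(\dot u, \dot v, u, v) = \frac{1}{2}(u - v)\left(A(u)\dot u^2 - A(v)\dot v^2\right).$

Applying the Hamilton-Jacobi method to the Lagrangian system with Lagrangian $L =
T(\dot u, \dot v, u, v)$, show that the geodesic motion on the ellipsoid admits a second prime integral:

$M(\dot u, \dot v, u, v) = (u - v)\left(vA(u)\dot u^2 - uA(v)\dot v^2\right).$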


(Hint: Write the Hamilton-Jacobi equation in (u, v) variables after finding the Hamiltonian 
function in (u, v) and in their canonically conjugate momenta Pu, Pv: 


o 17,0 1 o 1 
f alc (of 


a + 27 Ga) Hew t Ge) CR =N; 


at 2 


where $H(u, v) = (u - v)A(u)$, $G(u, v) = (v - u)A(v)$, and look for solutions of the form


fuvt)=-Zt+ Yu»), Wu) = alu) + (0) 


The equation becomes 


A() (22) a A(u)(S2)? = (u — v) A(u) A(v) E, 


admitting a family of solutions parameterized by E and a new arbitrary parameter 4: 


12 See [49], Chap. VI.
$\psi(u, v \,|\, \alpha, E) = \int \sqrt{E A(u')(u' + \alpha)}\, du' + \int \sqrt{E A(v')(v' + \alpha)}\, dv'.$


Now, applying the canonical transformation G generated by — ft + w(u, Vv, a, E), deduce 
that the trajectories of the motion (geodesics on €) are given by the equation de =c= 


cnstant, i.e., Fa (u) + Fa (v) = 2c if Falu) is a primitive function to A(u) (u + a). 
This also implies that @ is a prime integral. Writing the canonical transformation G,, it is 
possible to express @ in terms of U, V, Pu, Py or U,V, Ù, Ù. The computation gives @ = 


—M(u, 0, u, v)/T (ù, ù, u,v) = —M/E; so M is a prime integral. ([10]).) 
7. Consider the system (“atom in electric field”) 


2 
P g 
H(p,a) = d 


$p = (p_x, p_y, p_z)$, $q = (x, y, z)$ and study it in “squared parabolic coordinates”

$x = \frac{1}{2}(u^2 - v^2), \quad y = uv\cos\varphi, \quad z = uv\sin\varphi$


and show by the method of problem 6 (i.e., by the Hamilton-Jacobi method) that this 
system has three prime integrals and that it can be integrated by quadratures (from [46]). 


8. Consider the Hamiltonian (“ionized hydrogen molecule” ) 


= 2m jaz] la- fa 


with $p = (p_x, p_y, p_z) \in R^3$, $q = (x, y, z) \in R^3$, $f_1, f_2 \in R^3$ and study it in elliptical
coordinates (see Eq. (4.12.24) and (4.12.25)) and show by the methods of problems 6 and
7 that it has three prime integrals and that it can be integrated by quadratures. Find
canonical action-angle variables (from [46]).


4.13 Some Integrability Criteria. Introduction: 
Geometric Considerations and Preliminary Definitions 


Considering the “rarity” of the mechanical systems known as integrable one 
wonders whether it is possible to easily recognize, a priori, the non integrability 
of a mechanical system. 

For instance, the integrability on a region $W$ of the data space $S$ implies
the existence of $\ell$ “independent” prime integrals. Therefore, a way of showing
non integrability might be that of showing the nonexistence of as many prime
integrals as the number of degrees of freedom.

In any concrete case, however, it is very difficult to decide whether or not 
a system possesses prime integrals (other than the total energy and its func- 
tions). Poincaré’s proof of non integrability, in a sense stricter than the above, 
of the motion of three heavenly bodies is based on showing the nonexistence
of enough prime integrals (also defined in a more stringent way). It is still a
famous proof (see [38], vol. 1, ch. VI). 

Hence, it is useful to try to identify other special properties of the in- 
tegrable systems to use them as necessary conditions for integrability or to 
formulate sufficient non integrability conditions. 

In the following sections we go through an analysis that will allow us to 
classify the motions of the integrable systems as “simple and ordered” motions 
and those of the non integrable ones as “complex and disordered”. 

Coming back into the frame of mind of §2.21 and §4.8, the notions of
observable, average values, etc. introduced there have a natural extension to
the systems with several degrees of freedom.

We consider an $\ell$-degrees-of-freedom system described by a Lagrangian
function


$L(\dot x, x)$ + ideal constraints (4.13.1)


regular in the sense of §3.11, Definition 14, and generating Hamiltonian equa- 
tions admitting global solutions, in the future and in the past for all the 
constraint-compatible initial data. 

As usual, we denote $S$ the data space for the system of Eq. (4.13.1). By
Proposition 18, p.285, it is a regular surface in $R^d \times R^d$ where $d$ is the
dimension of the unconstrained system, usually $d = 3N$, $N$ = {number of points in
the system}, see Definition 9, §4.8, p.287.


12 Definition. The elements of $C^\infty(S)$^{13} will be called the “observables” of
the mechanical system of Eq. (4.13.1).
Given an increasing sequence $t = (t_0, t_1, \ldots)$ such that $t_i \to +\infty$ as $i \to \infty$ and given
$f \in C^\infty(S)$, we shall call the “$t$-history of $f$” on the motion of $(\dot x, x) \in S$ the
sequence

$\left(f(S_{t_i}(\dot x, x))\right)_{i=0}^{\infty}$ (4.13.2)

It is the sequence of the results of the successive observations of the values of
$f$ on the motion starting at $(\dot x, x)$ at times $t_0, t_1, \ldots$. We shorten the notation
by simply referring to the “$(f, t)$-history of $(\dot x, x)$”.
If $f \in C^\infty(S)$ and $t$ is a sequence like

$t = (i\,t_1)_{i=0}^{\infty}, \quad t_1 > 0,$ (4.13.3)

and if $(\dot x, x) \in W \subset S$, where $W$ is a region on which the mechanical system
of Eq. (4.13.1) is integrable, then the $(f, t)$-history of $(\dot x, x)$ is far from being
an “arbitrary” sequence of numbers. Proposition 6, §4.2, p.251, allows us to
state, for instance, the following obvious reformulation of its contents.


26 Proposition. If in $W \subset S$ the system of Eq. (4.13.1) is integrable and
$f \in C^\infty(S)$ is an observable and if $t$ is as in Eq. (4.13.3), the $(f, t)$ histories


13 See Observation (2) to Definition 7, p.285.
of the points $(\dot x, x) \in W$ have a well-defined average value, i.e., the following
limit exists:

$\bar f(\dot x, x) = \lim_{N\to\infty} \frac{1}{N}\sum_{j=0}^{N-1} f(S_{j t_1}(\dot x, x))$ (4.13.4)

Furthermore, if $(A, \varphi) = I(\dot x, x)$ is the integrating transformation mapping
$W$ onto $V \times T^\ell$, see Definition 10, §4.8, p.287, and if the $(\ell + 1)$ numbers
$(\omega_1(A), \ldots, \omega_\ell(A), \sigma)$, with $\sigma = \frac{2\pi}{t_1}$, are rationally independent then

$\bar f(\dot x, x) = \frac{1}{(2\pi)^\ell}\int_{T^\ell} F_f(A, \varphi')\, d\varphi',$ (4.13.5)

having set

$F_f(A, \varphi) = f(I^{-1}(A, \varphi)).$ (4.13.6)


Observations.

(1) $F_f$ is the observable $f$ in the new $(A, \varphi)$ coordinates.

(2) Hence, the non integrability of the system of Eq. (4.13.1) in $W$ can be
proved by “just” exhibiting a single point $(\dot x, x) \in W$ and a single observable
$f$ whose $(f, t)$ history on $(\dot x, x)$ does not have a well-defined average value.

(3) However, this criterion is very difficult to apply in practice: the $(f, t)$
histories are very hard to analyze in concrete interesting cases and “usually”
they admit an average value even in non integrable systems.
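In the integrating coordinates the content of Proposition 26 can be tried out directly on a quasi-periodic rotation of $T^2$. The sketch below is only an illustration under assumptions made here: the observable $F(\varphi) = (1 + \cos\varphi_1)(1 + \sin\varphi_2)$, whose phase average is 1, the pulsations and the helper names are all hypothetical choices. With $\omega = (1, \sqrt 2)$ and $t_1 = 1$ (so $\sigma = 2\pi$) the sampled average approaches the phase average; with $\omega = (2\pi, \sqrt 2)$ the numbers $\omega_1, \omega_2, \sigma$ are rationally dependent and the average still exists but depends on the initial datum.

```python
import math

TWO_PI = 2.0 * math.pi

def observable(phi):
    """Sample observable on the 2-torus; its average over the torus equals 1."""
    return (1.0 + math.cos(phi[0])) * (1.0 + math.sin(phi[1]))

def time_average(f, phi0, omega, t1, n):
    """Average of f over the points phi0 + omega * j * t1 (mod 2 pi), j = 0, ..., n-1."""
    total = 0.0
    for j in range(n):
        phi = [(p + w * j * t1) % TWO_PI for p, w in zip(phi0, omega)]
        total += f(phi)
    return total / n

if __name__ == "__main__":
    n = 20000
    independent = time_average(observable, (0.3, 1.1), (1.0, math.sqrt(2.0)), 1.0, n)
    resonant = time_average(observable, (0.0, 0.0), (TWO_PI, math.sqrt(2.0)), 1.0, n)
    print("rationally independent case:", independent)  # close to the phase average 1
    print("resonant case:", resonant)                    # close to 1 + cos(0) = 2
```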


The following proposition provides a more geometric integrability criterion 
different in spirit from the one above. 


27 Proposition. If in $W \subset S$ the system of Eq. (4.13.1) is integrable, the
closure of every trajectory of points $(\dot x, x) \in W$ is a set $Z$ which can be mapped
continuously and in a one-to-one way onto a torus $T^s$ with $1 \le s \le \ell$, if $\ell$ is
the number of degrees of freedom of the system.


Observations. 

(1) Proposition 27 is also essentially a way of rephrasing some properties of the
integrability of Definition 10, §4.8, p.287. In fact the motions of an integrable
system take place on invariant tori of dimension $\ell$ run quasi-periodically. If
$(\omega_1, \ldots, \omega_\ell)$ are the pulsations of a given motion and are rationally
independent then the trajectory fills densely a set homeomorphic to $T^\ell$,^{14} see
Proposition 4, p.250, §4.2. In general, if $s$ is the number of elements of a maximal
subset of $(\omega_1, \ldots, \omega_\ell)$ consisting of rationally independent numbers, then $Z$
will be homeomorphic to $T^s$. The proof of this fact is left to the reader and
is essentially described in the hints to the problems for §4.14.

(2) So to prove non integrability in $W$, it suffices to find “just” one $(\dot x, x) \in W$


14 i.e., a set which is a one-to-one bicontinuous image of $T^\ell$.
whose trajectory has a closure which is not homeomorphic to a smooth surface
or, more particularly, to an $s$-dimensional torus, $s \le \ell$.

(3) The geometric structure of a trajectory of a point mass bound to a surface
$\Sigma$ can be found via Maupertuis’ principle since the trajectory of an energy-$E$
motion is a geodesic for the metric $dh = \sqrt{2(E - V(\xi))}\, ds$ on $\Sigma$, where $ds$ is
the line element on $\Sigma$ and $V$ is the potential energy of the active forces.

In geometry, some criteria for the existence of dense geodesics on a bounded
surface with some metric are known (e.g., if the curvature of the metric is
everywhere negative, there are dense geodesics). So with the help of Proposition
27 and of Maupertuis’ principle, some examples of non integrable systems
can be easily built.


To obtain deeper insight into integrable systems, it is convenient to restrict 
attention to the “analytically integrable” systems. 

They are connected with some interesting geometrical notions which we 
have to illustrate before continuing the analysis. 

To help the reader avoid getting lost in the labyrinth of the geometric
concepts that follow, it is better to state our aim at the beginning. Basically
we wish to define sets $G \subset R^d \times T^{d'}$ with “piecewise analytic boundary” (see
the following definition of analyticity). Such sets have the remarkable property
that not only are they measurable in the Riemann sense, but also that their
intersections with planar surfaces are measurable with respect to the Riemann
measure on the surface. This is a property which might not hold for sets with
$C^\infty$ boundary (see Problems).

We shall need, in an essential way, the above simple property and its
invariance with respect to some changes of coordinates. There are several
ways of constructing families of sets and classes of coordinate changes with
this property. However, none of them seems describable in a few words, although
this fact might seem surprising. It will be an amusing puzzle for the reader to
try to find (possibly giving up analyticity) some alternative definitions which
would allow us to retain the substance of §4.14 and §4.15.


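13 Definition. If $\Omega \subset R^d$ is open and $f \in C^\infty(\Omega)$, then $f$ is “analytic” on
$\Omega$ if $\forall\, \xi_0 \in \Omega$, $\exists\, \varepsilon(\xi_0) > 0$ such that $f$ can be developed, $\forall\, |\xi - \xi_0| < \varepsilon(\xi_0)$, as

$f(\xi) = \sum_{k_1, \ldots, k_d}^{0,\infty} \frac{\partial^{k_1 + \ldots + k_d} f}{\partial\xi_1^{k_1} \cdots \partial\xi_d^{k_d}}(\xi_0) \prod_{j=1}^{d} \frac{(\xi_j - \xi_{0j})^{k_j}}{k_j!}$ (4.13.7)

and

$\sum_{k_1, \ldots, k_d}^{0,\infty} \left| \frac{\partial^{k_1 + \ldots + k_d} f}{\partial\xi_1^{k_1} \cdots \partial\xi_d^{k_d}}(\xi_0) \right| \frac{\varepsilon(\xi_0)^{k_1 + \ldots + k_d}}{k_1! \cdots k_d!} < +\infty.$ (4.13.8)

If $k = (k_1, \ldots, k_d) \in Z_+^d$, define $k! = \prod_{j=1}^{d} k_j!$, $(\xi - \xi_0)^k = \prod_{j=1}^{d} (\xi_j - (\xi_0)_j)^{k_j}$
and $\partial^k f(\xi)$ for $\frac{\partial^{k_1 + \ldots + k_d} f}{\partial\xi_1^{k_1} \cdots \partial\xi_d^{k_d}}(\xi)$, so that Eq. (4.13.7) will be rewritten as

$f(\xi) = \sum_{k} \frac{\partial^k f(\xi_0)}{k!} (\xi - \xi_0)^k.$ (4.13.9)


An $R^d$-valued function on $\Omega$ is called analytic if its components are analytic.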


In the following, we will also need the obvious extension of the notion of
analytic function $f$ to the case when $\Omega$ is an open subset in $R^d \times T^{d'}$ and its
values are in $R^\ell \times T^{\ell'}$, $d + d' > 0$, $\ell + \ell' > 0$, $d, d', \ell, \ell' \ge 0$.

First remark that a real function $f$ on $\Omega \subset R^d \times T^{d'}$ can be “canonically
extended” to a function $\tilde f$ on a set $\tilde\Omega \subset R^d \times R^{d'}$ by setting

$\tilde\Omega = \{\text{pairs } (\xi, \eta) \in R^d \times R^{d'} \text{ with } (\xi, \eta \bmod 2\pi) \equiv (\xi, \varphi) \in \Omega\}$

$\tilde f(\xi, \eta) \overset{\rm def}{=} f(\xi, \eta \bmod 2\pi) = f(\xi, \varphi).$ (4.13.10)

With the above convention (4.13.10) and with the above restrictions on
$d, d', \ell, \ell'$, we state the following definition (for some examples see Exercises).


14 Definition. A function $f$ on $\Omega \subset R^d$ taking values in $R^\ell \times T^{\ell'}$ and
associating to $\xi \in \Omega$ the value $(x, \varphi)$ will be called “analytic” on $\Omega$ if, $\forall\, \xi_0 \in \Omega$, there
is a function $F$, “representative of $f$”, defined in the vicinity of $\xi_0$, taking
values in $R^\ell \times R^{\ell'}$ and analytic, such that

if $F(\xi) = (x, \eta)$ then $f(\xi) = (x, \varphi)$ with $\varphi = \eta \bmod 2\pi$ (4.13.11)

for all $\xi$ near $\xi_0$.

A function $f$ on an open set $\Omega \subset R^d \times T^{d'}$ taking values in $R^\ell \times T^{\ell'}$ will be
called analytic on $\Omega$ if its canonical extension $\tilde f$ to $\tilde\Omega$ is analytic.

The derivatives of $f$ will obviously be defined as the “derivatives of the
canonical extension of a representative” and they will be denoted by the usual symbols.


Observation. If some of the integers $\ell, \ell', d, d'$ vanish, we interpret $R^d \times T^{d'}$ or
$R^\ell \times T^{\ell'}$ in the obvious way: $R^0 \times T^p = T^p$, $R^p \times T^0 = R^p$, $\forall\, p > 0$.


Together with the notion of analytic function, we need the notion of ana- 
lytic coordinates. 


15 Definition. Let $U \subset R^d \times T^{d'}$ be open and let $\Xi$ be an $R^d \times T^{d'}$-valued
analytic function defined on an open set $\Omega \subset R^d \times R^{d'}$ such that:

(i) $\Xi$ is invertible as a map between $U$ and $\Omega$;

(ii) the Jacobian determinant of $\Xi$ never vanishes on $\Omega$ (“$\Xi$ is nonsingular”);^{15}

(iii) $\Xi$ and $\Xi^{-1}$ are analytic in $\Omega$ and $U$, respectively. Then we say that
$(U, \Xi)$ is an analytic system of local coordinates on $U$.


15 Naturally, the Jacobian determinant of $\Xi$ in $\omega$ is the Jacobian determinant of a
representative of $\Xi$ near $\omega$ (see Definition 14, above).
If $U \subset R^d \times T^{d'}$, $V \subset R^{\bar d} \times T^{\bar d'}$ are open sets and $d + d' = \bar d + \bar d'$, and if
$\Xi$ is an analytic function on $U$ taking values in $V$ and establishing between
$U$ and $V$ a one-to-one nonsingular correspondence with analytic inverse, then
we shall say that $\Xi$ is an analytic correspondence between $U$ and $V$.


Observation. Some among $d, d', \bar d, \bar d'$ may vanish: see the Observation to
Definition 14.


It is now possible to establish the definition of an analytic surface. The 
reader should try to make drawings to see the various geometrical objects 
discussed in the following definitions and observations. 


16 Definition. A regular surface $\Sigma \subset R^d \times T^{d'}$ is said to be “locally analytic”
in an open set $U \subset R^d \times T^{d'}$ if there is a family of local analytic systems of
coordinates $(U_\alpha, \Xi_\alpha)_{\alpha \in A}$ with bases $(\Omega_\alpha)_{\alpha \in A}$ such that:

(i) the points of $\Sigma \cap U_\alpha$ are those which in $(U_\alpha, \Xi_\alpha)$ have coordinates $\beta_1 =
\ldots = \beta_{d+d'-\ell} = 0$ where $\ell$ is the “dimension” of $\Sigma$, i.e., $(U_\alpha, \Xi_\alpha)$ are adapted
to $\Sigma$;

(ii) as $\alpha$ varies in $A$, the sets $U_\alpha$ cover $\Sigma \cap U$ and $A$ is a finite set of indices.

If $\Sigma$ is a locally analytic surface in $U$ and $f$ is an $R^{\bar d} \times T^{\bar d'}$-valued function
on $\Sigma$, we shall say that “$f$ is analytic on $\Sigma$” if it is the restriction to $\Sigma$ of
an analytic function on an open set $\tilde U \supset \Sigma \cap U$.

If $\Sigma \subset U$ is a closed set and if $\Sigma$ is a locally analytic surface in $U$, we shall
say that $\Sigma$ is an “analytic surface” (this notion is $U$ independent).


Observations.

(1) If some of the $d, d', \bar d, \bar d'$ vanish see the Observation to Definition 14.

(2) Examples are discussed in the problems and exercises at the end of this
section.


Finally, we define the “analytically regular sets”. 


17 Definition. A closed set $G \subset R^d \times T^{d'}$ will be called “locally analytic” in
the open set $U \subset R^d \times T^{d'}$ if $\partial G$ is a surface locally analytic in $U$.

If $G$ is locally analytic in $U$ and $G \subset U$, then $G$ will be called “an analytic
set” (this notion is $U$ independent).

A closed set $G \subset R^d \times T^{d'}$ will be called “analytically regular” if there is an
open set $U \supset G$ and a family of sets locally analytic in $U$ through which, via
a finite number of union and intersection operations, one can build $G$.


Observations.

(1) If $d$ or $d'$ vanish, see the Observation to Definition 14.

(2) Any analytic surface is an analytic set (since either $\partial\Sigma = \Sigma$ or $\Sigma =
R^d \times T^{d'}$).

(3) If $\Xi$ is an analytic transformation of $U \subset R^d \times T^{d'}$ onto $V \subset R^{\bar d} \times T^{\bar d'}$
and if $G \subset U$ is an analytically regular set then $\Xi(G) \subset V$ is also analytically
regular, i.e., the above notion is invariant under analytic maps. This follows
from the fact that composing analytic functions, one obtains analytic
functions.^{16}

(4) If $\Sigma$ is a surface locally analytic in $U$ and if $G \subset U$ is an analytically
regular set, then $G \cap \Sigma$ is also an analytically regular set: this is the “invariance
under the intersection operations” of the analytic regularity.

(5) Let $\bar d \ge d$, $\bar d' \ge d'$ and regard $R^d \times T^{d'}$ as a subset of $R^{\bar d} \times T^{\bar d'}$ by
identifying it as the subset of $R^{\bar d} \times T^{\bar d'}$ consisting of the points $(X, \Phi)$ such that
$X = (x, 0)$, $\Phi = (\varphi, 0)$ with $(x, \varphi) \in R^d \times T^{d'}$ and the 0’s denote the origins
in $R^{\bar d - d}$ and $T^{\bar d' - d'}$ respectively. Then $R^d \times T^{d'}$ is an analytic surface in
$R^{\bar d} \times T^{\bar d'}$.
If $G \subset R^d \times T^{d'}$ is analytically regular, then its “extension” $\bar G$ to $R^{\bar d} \times T^{\bar d'}$,
$\bar G = \{(X, \Phi) \,|\, (X, \Phi) \in R^{\bar d} \times T^{\bar d'},\ X = (x, y),\ \Phi = (\varphi, \psi) \text{ with } (x, \varphi) \in G\}$, is
analytically regular in $R^{\bar d} \times T^{\bar d'}$.
(6) On every regular (or locally analytic) surface $\Sigma \subset R^d \times T^{d'}$, one can define
the “area measure”: if $(U, \Xi)$ is a regular (or analytic) system of local
coordinates adapted to $\Sigma$ with basis $\Omega$, there is a regular (or analytic) function $\sigma$
on $\Omega$ such that for $E \subset \Sigma \cap U$:

$\mathrm{area}(E) = \int_{\Xi^{-1}(E)} \sigma(0, \ldots, 0, \beta_{d+d'-\ell+1}, \ldots, \beta_{d+d'})\, d\beta_{d+d'-\ell+1} \cdots d\beta_{d+d'}$ (4.13.12)

provided $\Xi^{-1}(E)$ is measurable in the Riemannian sense (in this case, one
says that $E$ is measurable with respect to the area measure).
If $\Sigma \subset R^d$ is a regular surface and $(U, \Xi)$ is a well-adapted orthogonal and
of Fermi type system of local coordinates (in the sense of Definition 12, §3.7,
p.177, and Proposition 12, p.183) with respect to the scalar product $\eta \cdot \chi$ on
$R^d$ one has


(0, Sak „0, Batd'—e41, sa Bard’) = Vy of (4.13.13) 


essentially by (a very reasonable) definition; $\sigma$ in the other coordinate
systems is computed by ordinary coordinate transformations. The simplicity of
Eq. (4.13.13) provides a further illustration of the notion of “well-adapted
orthogonal” systems of coordinates of §3.7.

(7) One may think that it is possible to define something like “$C^\infty$-regular”
sets by simply replacing the word analytic by $C^\infty$ everywhere above. However,
the property in Observation (4), for instance, would not hold. See exercises to
§4.13.


The problem of the (Riemann) measurability of sets is not always trivial 
and the interest in the above digression on the definition of analytically regular 
sets rests mainly on the validity of the following proposition. 


28 Proposition. Let $\Sigma \subset R^d \times T^{d'}$ be a surface locally analytic in the open
set $U$ and let $E$ be an analytically regular set contained in $U$.


16 The reader can attempt a proof starting with the $\ell = 1$ case.
The set $E \cap \Sigma$ is then measurable with respect to the area measure on $\Sigma$,
i.e., given $\varepsilon > 0$, there exist two functions $\chi^+$ and $\chi^-$ of class $C^\infty$ on $\Sigma$,
$0 \le \chi^- \le \chi^+ \le 1$, such that if $\chi_{E \cap \Sigma}$ is the characteristic function of $E \cap \Sigma$:

(i) $\chi^-(\xi) \le \chi_{E \cap \Sigma}(\xi) \le \chi^+(\xi), \quad \forall\, \xi \in \Sigma,$ (4.13.14)

(ii) $\int_{\Sigma} (\chi^+(\xi) - \chi^-(\xi))\, d\sigma_\xi < \varepsilon,$ (4.13.15)

where the integral denotes the surface integral on $\Sigma$, see Observation (6) above.


We do not describe the proof of this proposition. Although it is not partic- 
ularly difficult it would require a preliminary analysis of the structure of the 
analytic surfaces and their intersections which, being marginal for us, would 
lead us too far away from our problem of discussing the integrability criteria 
for mechanical systems. 


Exercises and Problems

1. Show that the function on $T^1$: $\varphi \to \cos\varphi$ is analytic.


2. Show that the $T^1$-valued function $x \to x \bmod 2\pi$ is analytic.


3. If $f \in C^\infty(T^1)$ and if its Fourier coefficients can be bounded as $|\hat f_k| \le F c^{|k|}$, $c < 1$, then
$f$ is analytic on $T^1$. Prove this statement.


4. Generalize Problem 3 to the case of a function on $T^\ell$.


5. Show that a “surface” of R relatively closed in U C R and locally analytic in U (U open) 
is, inside U, a union of at most denumerably many points without accumulation points in 
U or coincides with U. Show that a bounded analytically regular set in R is a union of 
finitely many points and closed intervals. 


6. Show that straight lines, planes, half-lines, half-planes, and half-spaces are analytic sets
in $R^2$ and $R^3$.


7. Show that triangles, polygons, disks and their boundaries are analytically regular in $R^2$
and in $R^3$.


8. Show that the regular solids, the spheres, the dihedra, the trihedra, etc., and their
boundaries are analytically regular in $R^3$.


9. Show that the disk and the ellipse, or the ball and the ellipsoidal ball (i.e. the sets whose
boundaries are the sphere or the ellipsoid), and their boundaries are analytic sets in $R^2$ or
$R^3$, respectively.


10. Show that a disk in $R^3$ is not an analytic set although it is analytically regular.


11. Let $x_1, x_2, \ldots$ be an enumeration of the rational numbers in $[0, 1]$. For every $x_k$, consider
the open interval with length $2^{-1-k}$ and center $x_k$. Show that the union of such intervals
is an open set dense in $[0, 1]$ with external measure (in the Riemannian sense) $\ge 1$ and with
the internal measure $\le \frac{1}{2}$. Call this union $A$.


12. Let $g \in C^\infty(R)$ be a positive function on $(-\frac{1}{2}, \frac{1}{2})$ and zero elsewhere. Set


and show that $f$ is positive on $A$ (see Problem 11) and zero outside. Show also that $f$ is in
$C^\infty(R)$.


13. Show that the set $D_\infty = \{(x, y) \,|\, 0 \le x \le 1,\ f(x) \le y < +\infty\} \subset R^2$, with $f$ as defined
in Problem 12, has a piecewise $C^\infty$ boundary. Show that the intersection between $D_\infty$ and
the $x$-axis is not measurable in the Riemannian sense.


14. Let $v \in C^\infty(R^1)$ and consider the surface in $R^2$ with equations $z = v(x)$, $x \in R$. Show
that the “area” of a line element over $dx$ (i.e. its length) is, according to Eq. (4.13.13),

$d\sigma = \sqrt{1 + \left(\frac{dv}{dx}\right)^2}\, dx.$


15. Let $v \in C^\infty(R^2)$ and consider the surface in $R^3$ with equations $z = v(x, y)$, $(x, y) \in R^2$.
Show that the area of a surface element over $dx\,dy$ is, according to Eq. (4.13.13),

$d\sigma = \sqrt{1 + \left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2}\, dx\, dy.$
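The set $A$ of Problem 11 can be explored with exact rational arithmetic. The sketch below is an illustration only: the particular enumeration of the rationals (by increasing denominator) and the helper names are choices made here, and any other enumeration would do. It builds the first intervals, sums their lengths and measures their union, showing that the total length stays below $\frac{1}{2}$ although the centers become dense in $[0, 1]$.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def rationals_in_unit_interval():
    """Enumerate the rationals of [0, 1] without repetition, by increasing denominator."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            if gcd(p, q) == 1:
                yield Fraction(p, q)
        q += 1

def first_intervals(count):
    """Intervals of length 2**(-1-k) centered at x_k, for k = 1, ..., count."""
    out = []
    for k, x in enumerate(islice(rationals_in_unit_interval(), count), start=1):
        half = Fraction(1, 2 ** (k + 2))  # half of the length 2**(-1-k)
        out.append((x - half, x + half))
    return out

def union_length(intervals):
    """Exact length of the union of the given intervals (sort, then merge overlaps)."""
    total = Fraction(0)
    cur_lo = cur_hi = None
    for lo, hi in sorted(intervals):
        if cur_hi is None or lo > cur_hi:
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total

if __name__ == "__main__":
    ivs = first_intervals(30)
    print("sum of lengths:", float(sum(hi - lo for lo, hi in ivs)))
    print("length of the union:", float(union_length(ivs)))
```

The sum of the first $K$ lengths is $\frac{1}{2} - 2^{-1-K}$, which is the source of the bound on the internal measure.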


4.14 Analytically Integrable Systems. Frequency of 
Visits and Ergodicity 


In §4.8, Definition 11, p.289, we introduced the notion of “analytically
integrable” Hamiltonian systems defined on an open set $W \subset R^\ell \times R^\ell$ or $R^\ell \times T^\ell$
or $R^\ell \times R^{\ell_1} \times T^{\ell_2}$, $\ell_1 + \ell_2 = \ell$, in phase space.

The interest in analytically integrable systems is twofold: essentially all
concrete integrable systems so far met were analytically integrable (and this
could be verified with some labor); furthermore, if $t = (i\,t_1)_{i=0}^{\infty}$, the $(f, t)$
histories of the points in the integrability region $W$ of phase space have a
well-defined average value for all the $f \in C^\infty(W)$, and also for many other
more singular functions $f$, for instance for the characteristic functions of the
analytically regular sets.

To illustrate this remarkable property, it is convenient to introduce the 
following notions. 


18 Definition. Let $W$ be a subset of the phase space ($\subset R^\ell \times R^\ell$ or $R^\ell \times T^\ell$
or $R^\ell \times R^{\ell_1} \times T^{\ell_2}$, $\ell_1 + \ell_2 = \ell$) of an analytic time-independent Hamiltonian
system. Suppose that the system is analytically integrable on $W$.

Let $\mathcal{E} = (E_0, E_1, \ldots, E_p)$ be a family of subsets of $W$ such that:

(i) $\cup_{i=0}^{p} E_i = W$, $E_i \cap E_j = \emptyset$ if $i \ne j$, i.e., $\mathcal{E}$ is a “partition of $W$”;

(ii) $E_1, \ldots, E_p$ are analytically regular;

(iii) $d(E_i, E_j) > 0$ if $i \ne j$, $i, j = 1, \ldots, p$.

Obviously, $E_0 = W \setminus \cup_{j=1}^{p} E_j$ is an open set.

The partition $\mathcal{E}$ will be called an “analytically regular partition” of $W$. Denote
$\chi_{E_i}$ the characteristic function of the sets $E_i$, $i = 0, 1, \ldots, p$, and let


$f_{\mathcal{E}}(\xi) = \sum_{j=0}^{p} j\,\chi_{E_j}(\xi), \quad \xi \in W$ (4.14.1)


and, finally, we call “$(\mathcal{E}, t)$ history of $(p, q) \in W$” the $(f_{\mathcal{E}}, t)$ history of $(p, q)$
when $t = (t_i)_{i=0}^{\infty}$ is a divergent monotonic sequence.
Observation. The $(\mathcal{E}, t)$ history is a sequence of integers between 0 and $p$: the
$k$-th element of this sequence simply indicates into which set among those of
$\mathcal{E}$ the point $S_{t_k}(\xi)$ falls, if $t \to S_t(\xi)$, $t \ge 0$, is the solution to the Hamiltonian
equations with initial datum $\xi = (p, q)$.
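To make the notion concrete, the sketch below computes the first terms of such a history for the simplest model, a uniform rotation of the circle $T^1$ viewed in the angle variable. The partition into two arcs $E_1 = [0, 1]$, $E_2 = [2, 3]$ (with $E_0$ the rest of the circle), the pulsation and the function names are assumptions made for illustration.

```python
import math

TWO_PI = 2.0 * math.pi

def partition_label(phi, arcs):
    """Index j >= 1 of the arc [lo, hi] containing phi, or 0 if phi lies in none of them."""
    for j, (lo, hi) in enumerate(arcs, start=1):
        if lo <= phi <= hi:
            return j
    return 0

def history(phi0, omega, t1, arcs, n):
    """First n terms of the history of phi0 at the observation times 0, t1, 2*t1, ..."""
    return [partition_label((phi0 + omega * i * t1) % TWO_PI, arcs) for i in range(n)]

if __name__ == "__main__":
    arcs = [(0.0, 1.0), (2.0, 3.0)]  # E_1 and E_2, at positive distance from each other
    print(history(0.5, 1.0, 1.0, arcs, 10))  # [1, 0, 2, 0, 0, 0, 1, 0, 2, 0]
```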


The following proposition is very remarkable. 


29 Proposition. Given an analytic Hamiltonian system analytically
integrable in the subset $W$ of phase space, the limit

$\lim_{N\to\infty} \frac{1}{N}\sum_{j=0}^{N-1} \chi_E(S_{j t_1}(\xi))$ (4.14.2)

exists, $\forall\, t_1 > 0$, $\forall\, \xi \in W$ and for all analytically regular subsets $E$ of $W$.
This limit will be called naturally the “frequency of visit to $E$ by the motion
starting in $\xi$” with respect to the sequence of observation times $t =
(i\,t_1)_{i=0}^{\infty}$, $t_1 > 0$.


PROOF. The image $I(E) \subset V \times T^\ell$ of $E$ via the analytic integrating
transformation $I$, see Definition 11, §4.8, p.289, will still be analytically regular, see
observation (3) to Definition 17, p.338. Since for $I(\xi) = (A, \varphi)$,


$I(S_t \xi) = (A, \varphi + \omega(A)t),$ (4.14.3)


the proof of the above proposition is “reduced” to the one contemplated in 
the following one. 


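30 Proposition. Let $\omega = (\omega_1, \ldots, \omega_\ell)$ be an $\ell$-tuple of real numbers and let
$E \subset T^\ell$ be an analytically regular subset of $T^\ell$. If $t = (i\,t_1)_{i=0}^{\infty}$, $t_1 > 0$, the
frequency of visits

$\nu_E(\varphi) = \lim_{N\to\infty} \frac{1}{N}\sum_{j=0}^{N-1} \chi_E(\varphi + \omega j t_1)$ (4.14.4)

exists, $\forall\, \varphi \in T^\ell$. Furthermore, if the numbers $\omega_1, \ldots, \omega_\ell$ and $\sigma = \frac{2\pi}{t_1}$ are
rationally independent,

$\nu_E(\varphi) = \frac{1}{(2\pi)^\ell}\int_{T^\ell} \chi_E(\varphi')\, d\varphi'.$ (4.14.5)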
PROOF. We shall only treat the simple case when $(\omega_1, \ldots, \omega_\ell, \sigma)$ are
rationally independent because it is easy. The general case can be reduced to this
one with some patient though interesting work which we leave to the reader,
referring, as a guide, to the sequence of problems at the end of this section.
The idea of the proof is to use the Riemann measurability of $E$ (consequence
of its analytic regularity, see Proposition 28) to find two $C^\infty(T^\ell)$ functions
$\chi^-$ and $\chi^+$ verifying Eqs. (4.13.14) and (4.13.15) to infer that $\nu_E(\varphi)$, if
existing, must be between the averages of $\chi^-$ and $\chi^+$ which, in turn, exist and
differ at most by $\varepsilon$ by Eq. (4.13.15) and Proposition 6, §4.2, Eq. (4.2.10), p.251.
Then the arbitrariness of $\varepsilon$ implies the actual existence of $\nu_E(\varphi)$ and the fact
that it is between the averages $\frac{1}{(2\pi)^\ell}\int_{T^\ell} \chi^\pm(\varphi')\, d\varphi'$. Again the arbitrariness
of $\varepsilon$ and Eqs. (4.13.15) and (4.13.14) imply Eq. (4.14.5).
Observations.
Observations. 

(1) The reader will note that the analytic regularity of $E$ is used in the above
proof only to infer the Riemann measurability of $E$. However, if $\omega_1, \ldots, \omega_\ell, \sigma$
were not rationally independent the analytic regularity should again be used
to prove the reducibility of the general case to the rationally independent one.
This is the reason why the Riemann measurability is not in itself a general
sufficient condition for the existence of the limit of Eq. (4.14.4). See problems
at the end of this section.

(2) Of course if $\omega_1, \ldots, \omega_\ell, \sigma$ are rationally independent the Riemann
measurability of $E$ suffices, alone, to deduce Eqs. (4.14.5) and (4.14.6) as it appears
clear from the above proof.
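The two regimes of Proposition 30 can be observed numerically in the case $\ell = 1$. In the sketch below the set $E = [0, 1] \subset T^1$, the pulsations and the function name are illustrative assumptions. With $\omega = \sqrt 2$ and $t_1 = 1$ the numbers $\omega$ and $\sigma = 2\pi$ are rationally independent and the frequency of visits approaches the normalized length $1/2\pi$ of $E$; with $\omega = \pi/2$ the ratio $\omega/\sigma = 1/4$ is rational, the sampled points form a cycle of four points and the frequency exists but is $\frac{1}{4}$.

```python
import math

TWO_PI = 2.0 * math.pi

def visit_frequency(phi0, omega, t1, lo, hi, n):
    """Fraction of j = 0, ..., n-1 for which phi0 + omega * j * t1 (mod 2 pi) lies in [lo, hi]."""
    hits = 0
    for j in range(n):
        phi = (phi0 + omega * j * t1) % TWO_PI
        if lo <= phi <= hi:
            hits += 1
    return hits / n

if __name__ == "__main__":
    n = 100000
    print("independent:", visit_frequency(0.5, math.sqrt(2.0), 1.0, 0.0, 1.0, n))
    print("normalized length of E:", 1.0 / TWO_PI)
    print("resonant:", visit_frequency(0.5, math.pi / 2.0, 1.0, 0.0, 1.0, n))
```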


So every motion of a Hamiltonian system analytically integrable in $W$
visits an analytically regular set $E$ with a well-defined frequency of visit.
One can wonder about the frequency of joint visits to two given analytically
regular sets $E$ and $E'$. The remarkable fact is that they are, on the average,
“independent”. The frequency of a visit to $E$ followed $j$ time units later by a
visit to $E'$ is, on the average over $j$, equal to the product of the frequency of
visit to $E$ and of that of $E'$: $\forall\, \xi \in W$, $\forall\, t_1 > 0$,

$\lim_{N\to\infty} \frac{1}{N}\sum_{j=0}^{N-1} \nu_{E \cap S_{j t_1}(E')}(\xi) = \nu_E(\xi)\,\nu_{E'}(\xi).$ (4.14.6)

In other words, a visit to $E$ by a given motion does not put any restrictions on
the possibility of a visit to $E'$ $j$ time units later, at least on the average on $j$.
This is the content of the following proposition.


31 Proposition. In the assumptions of Proposition 29, let $E, E' \subset W$ be two
analytically regular sets. Then property (4.14.6) holds for all $\xi \in W$, $\forall\, t_1 > 0$.


Observation. This proposition is a corollary of the following Proposition 32
on the quasi-periodic motions on $T^\ell$ in the same way in which Proposition 29
appears to be a corollary of Proposition 30.


32 Proposition. Let $E, E' \subset T^\ell$ be two analytically regular sets and let
$\omega \in R^\ell$, $t_1 > 0$. Denote $E' + t\omega$ the set of points $\varphi' + t\omega \bmod 2\pi$ as $\varphi'$ varies
in $E'$ ($E' + t\omega$ is the set into which $E'$ evolves in time $t$ under the
quasi-periodic flow on $T^\ell$ with pulsations $\omega$). If $\nu_E(\varphi)$ is the frequency of visits of
the points $\varphi + j t_1 \omega$, $j = 0, 1, \ldots$ to $E$, it is, $\forall\, \varphi \in T^\ell$,


N-1 

he cal 

lim y Yo venEtinw lP) = velp)ve (p). (4.14.7) 
j=0 
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Observations 

(1) When w,...,we and o = an are rationally independent ve(y) is the 
measure of E, see Eq. (4.14.5). Hence, Eq. (4.14.7) means that in this case 
the fraction of E occupied by images of points of E’ is a fraction of E equal, 
on the average, to the measure of F”. In other words, E’ + jtjw is uniformly 
scattered in 7“, on the average. This holds for Y E’ analytically regular. 

(2) By considering the case @ = 1 and taking E and E’ to be two small 
intervals, one sees that the limit of Ven(E'+jtiw) (P) as J > œ does not exist 
in general: even in the case of rational independence of w, = the average over 
j in Eq. (4.14.7) is essential. Therefore, even though on the average E’ + jtiw 
is uniformly scattered in 7‘, it is not true that for large times j this set is 
uniformly scattered. This is due manifestly to the fact that the rotations of 
the torus are “rigid” transformations and they do not “mix” the points of T°. 
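
Observations (1) and (2) can be illustrated numerically for $\ell = 1$. The following is only a sketch under stated assumptions: the circle is rescaled to length 1 (instead of $2\pi$), the rotation step `ALPHA` (playing the role of $t_1\omega/2\pi$) is taken to be the golden mean, and $E = E' = [0, 0.1)$. In the rationally independent case the frequency $\nu_{E \cap (E'+jt_1\omega)}$ is, by Eq. (4.14.5), the normalized length of the overlap, which is what the hypothetical helper `arc_overlap` computes.

```python
import math

def arc_overlap(a, b, x):
    """Length of [0, a) ∩ [x, x + b) on the circle R/Z, for 0 < a, b < 1."""
    x %= 1.0
    total = 0.0
    # [x, x + b) may wrap past 1, so also test its copy shifted by -1.
    for shift in (-1.0, 0.0):
        lo = max(0.0, x + shift)
        hi = min(a, x + b + shift)
        total += max(0.0, hi - lo)
    return total

# Hypothetical parameters: golden-mean rotation step and two short arcs.
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0   # plays the role of t_1 * omega / (2*pi)
A = B = 0.1                            # normalized lengths of E and E'
N = 5000

overlaps = [arc_overlap(A, B, j * ALPHA) for j in range(N)]
cesaro_mean = sum(overlaps) / N

print("average over j  :", cesaro_mean)
print("product |E||E'| :", A * B)
print("min, max over j :", min(overlaps), max(overlaps))
```

The average over $j$ is close to the product of the two measures, while the individual terms keep oscillating between 0 and the full length of the arc, so no limit as $j \to \infty$ exists without the average.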


PROOF. As in the case of Proposition 29, let us only treat the simple case
when $\omega_1,\ldots,\omega_\ell$ and $2\pi/t_1$ are rationally independent. The general case can
be treated by solving the last of the problems at the end of this section.

Proceeding as in the proof of Proposition 30 and using the Riemann
measurability [see Eq. (4.13.15)] of the sets $E, E'$, the problem of proving Eq.
(4.14.7) is reduced to that of proving, $\forall f, g \in C^\infty(\mathbf T^\ell)$,

$$\lim_{N\to\infty} \frac{1}{N} \sum_{j=0}^{N-1} \overline{f(\varphi)\, g(\varphi + jt_1\omega)} = \bar f(\varphi)\, \bar g(\varphi), \tag{4.14.8}$$

where the bar over a function of $\varphi$ denotes the average:

$$\bar f(\varphi) = \lim_{N\to\infty} \frac{1}{N} \sum_{j=0}^{N-1} f(\varphi + jt_1\omega). \tag{4.14.9}$$

Note that Eq. (4.14.8) would directly become Eq. (4.14.7) if one could take
$f = \chi_E$, $g = \chi_{E'}$.

To prove this proposition, Eq. (4.14.8) shall be applied to the functions
$\chi^+, \chi'^+$ which, according to Proposition 28, approximate $\chi_E, \chi_{E'}$ from above
and to the functions $\chi^-, \chi'^-$ which approximate $\chi_E, \chi_{E'}$ from below, following
the approximation idea of the proof of Proposition 30.

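Eq. (4.14.8) can now be checked. By the simplifying assumption of rational
independence, see Proposition 6, §4.2, p.251,

$$\bar f(\varphi) = \int_{\mathbf T^\ell} f(\varphi')\, \frac{d\varphi'}{(2\pi)^\ell} = \hat f_0, \qquad \bar g(\varphi) = \int_{\mathbf T^\ell} g(\varphi')\, \frac{d\varphi'}{(2\pi)^\ell} = \hat g_0, \tag{4.14.10}$$

$\hat f_n, \hat g_n$ being the Fourier coefficients of $f, g$, respectively. Furthermore, since

$$f(\varphi)\, g(\varphi + jt_1\omega) = \sum_{n,n'} \hat f_n\, \hat g_{n'}\, e^{i(n\cdot\varphi + n'\cdot\varphi + jt_1 n'\cdot\omega)} = \sum_{m} e^{i m\cdot\varphi} \sum_{n+n'=m} \hat f_n\, \hat g_{n'}\, e^{i jt_1 n'\cdot\omega}, \tag{4.14.11}$$

one finds, still by Proposition 6, §4.2,

$$\overline{f(\varphi)\, g(\varphi + jt_1\omega)} = \{\text{Fourier coefficient of order 0 of the function in Eq. (4.14.11)}\} = \sum_{n} \hat f_n\, \hat g_{-n}\, e^{-i jt_1 n\cdot\omega}. \tag{4.14.12}$$

Then $\frac{1}{N}\sum_{j=0}^{N-1} \overline{f(\varphi)\, g(\varphi + jt_1\omega)}$ is

$$\sum_{n} \hat f_n\, \hat g_{-n}\, \frac{1}{N}\sum_{j=0}^{N-1} e^{-i jt_1 n\cdot\omega} = \hat f_0\, \hat g_0 + \sum_{n \ne 0} \hat f_n\, \hat g_{-n}\, \frac{1}{N}\, \frac{e^{-i N t_1 n\cdot\omega} - 1}{e^{-i t_1 n\cdot\omega} - 1} \tag{4.14.13}$$
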

and, by the usual argument of passage to the limit under the series sign,
it follows that the limit as $N \to +\infty$ of Eq. (4.14.13) is just $\hat f_0 \hat g_0$, which
shows, recalling (4.14.10), the validity of Eq. (4.14.8) and, hence, the validity
of the above proposition (in the special case treated here). mbe
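
The termwise limit invoked in the last step can be made explicit as follows. This is a sketch of the standard estimate, not a transcription from the text; the abbreviation $\theta = t_1\, n\cdot\omega$ is introduced here only for convenience. For $n \ne 0$, rational independence of $\omega_1,\ldots,\omega_\ell, 2\pi/t_1$ gives $\theta \notin 2\pi\mathbf Z$, so the geometric sum has a non-vanishing denominator:

```latex
\left| \frac{1}{N} \sum_{j=0}^{N-1} e^{-i j \theta} \right|
  = \frac{1}{N} \left| \frac{e^{-i N \theta} - 1}{e^{-i \theta} - 1} \right|
  = \frac{1}{N}\, \frac{|\sin(N\theta/2)|}{|\sin(\theta/2)|}
  \le \frac{1}{N\, |\sin(\theta/2)|}
  \xrightarrow[N \to \infty]{} 0 ,
  \qquad \theta = t_1\, n \cdot \omega \notin 2\pi \mathbf{Z} .
```

Each term of the series is also bounded by $|\hat f_n|\,|\hat g_{-n}|$ uniformly in $N$, and these bounds are summable because $f, g$ are $C^\infty$; this is what allows the limit to be taken under the series sign.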


The above propositions imply some simple consequences.
Let $\mathcal E = (E_0, E_1, \ldots, E_s)$ be a partition of the phase space $W$ of an analytically
integrable Hamiltonian system. Suppose that $\mathcal E$ is analytically regular in $W$
in the sense of Definition 18, p.341. Given $t = (i t_1)_{i=0}^\infty$, the partition $\mathcal E$ and
$k > 0$, $0 \le j_1 < j_2 < \ldots < j_k$, define

$$E\binom{j_1\, \ldots\, j_k}{\alpha_1 \ldots \alpha_k} \overset{\text{def}}{=} S_{-j_1 t_1}(E_{\alpha_1}) \cap S_{-j_2 t_1}(E_{\alpha_2}) \cap \ldots \cap S_{-j_k t_1}(E_{\alpha_k}). \tag{4.14.14}$$

This is the set of the points $\xi \in W$ such that

$$S_{j_1 t_1}(\xi) \in E_{\alpha_1},\quad S_{j_2 t_1}(\xi) \in E_{\alpha_2},\quad \ldots,\quad S_{j_k t_1}(\xi) \in E_{\alpha_k}. \tag{4.14.15}$$

From Eq. (4.14.15) and from the fact that $\mathcal E$ is a partition of $W$, it is

$$E\binom{j_1\, \ldots\, j_k}{\alpha_1 \ldots \alpha_k} \cap E\binom{j_1\, \ldots\, j_k}{\beta_1 \ldots \beta_k} = \emptyset \tag{4.14.16}$$

unless $\alpha_1 = \beta_1, \ldots, \alpha_k = \beta_k$. Also

$$\bigcup_{\alpha_1, \ldots, \alpha_k}^{0,s} E\binom{j_1\, \ldots\, j_k}{\alpha_1 \ldots \alpha_k} = W \tag{4.14.17}$$

and

$$\bigcup_{\alpha=0}^{s} E\binom{j_1 \ldots j_{p-1}\; j_p\; j_{p+1} \ldots j_k}{\alpha_1 \ldots \alpha_{p-1}\; \alpha\; \alpha_{p+1} \ldots \alpha_k} = E\binom{j_1 \ldots j_{p-1}\; j_{p+1} \ldots j_k}{\alpha_1 \ldots \alpha_{p-1}\; \alpha_{p+1} \ldots \alpha_k}. \tag{4.14.18}$$

It is also clear that if $\alpha_i \ne 0$, $\forall i = 1, \ldots, k$, the set $E\binom{j_1\, \ldots\, j_k}{\alpha_1 \ldots \alpha_k}$ is
analytically regular because the time evolution transformations $(S_t)_{t\in\mathbf R}$ are analytic
(being such after the analytic change of coordinates $I$ [see Eq. (4.8.14)] which
integrates the system; see footnote 17) and because the analytic image of an analytically
regular set is still analytically regular [see Observation (3) to Definition 17,
p.338].


The sets $E\binom{j_1\, \ldots\, j_k}{\alpha_1 \ldots \alpha_k}$ have a simple physical meaning.

We imagine that the partition $\mathcal E$ models an actual observation of some
physical quantity. The results of the observations, read on a dial, give a finite
number of results $1, 2, \ldots, s$ or 0 (“off the dial”). Since the results of physical
measurements can always be numbered from 1 to some $s$, this is a very general
model.

Thus the phase space $W$ is divided by collecting together all the physical
configurations $\xi \in W$ that produce the same result for the value of the physical
quantity described by Eq. (4.14.1) in this model.

Given a sequence of observation times $0, t_1, t_2, \ldots, t_j = j t_1$, we can
decide to record the results of the observations made at times $j_1 t_1, \ldots, j_k t_1$.
We see that the possible outcomes of such observations are $(s+1)^k$ $k$-tuples
$(\alpha_1, \ldots, \alpha_k)$ and we can partition $W$ into $(s+1)^k$ sets of the form
$E\binom{j_1\, \ldots\, j_k}{\alpha_1 \ldots \alpha_k}$, collecting the points falling in $E_{\alpha_1}$ at time $j_1 t_1$, in $E_{\alpha_2}$ at
time $j_2 t_1$, ..., in $E_{\alpha_k}$ at time $j_k t_1$.

In terms of the above mathematical notions, it is possible to formulate an 
interesting proposition whose physical meaning can easily be gathered from 
the just discussed interpretation. 


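33 Proposition. Let $W$ be the phase space of a time-independent analytically
integrable Hamiltonian system. Let $t_1 > 0$ and let $\mathcal E = (E_0, \ldots, E_s)$ be an
analytically regular partition of $W$. Then $\forall \xi \in W$, the frequencies of visits
to the sets of the form of Eq. (4.14.14) by the motion starting at $\xi$ exist.
Denoting such frequencies

$$p\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,\xi\Big) \overset{\text{def}}{=} \nu_{E\binom{j_1 \ldots j_k}{\alpha_1 \ldots \alpha_k}}(\xi) \tag{4.14.19}$$

it also follows that:

(i) $\forall k \in \mathbf Z_+$, $\forall\, 0 \le j_1 < j_2 < \ldots < j_k$ integers, $\forall\, \alpha_1, \alpha_2, \ldots, \alpha_k$ in $(0, 1, \ldots, s)$:

$$p\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,\xi\Big) \ge 0 \tag{4.14.20}$$

(ii) $\forall k \in \mathbf Z_+$, $\forall\, 0 \le j_1 < j_2 < \ldots < j_k$ integers, $\forall\, \alpha_1, \alpha_2, \ldots, \alpha_k$ in
$(0, 1, \ldots, s)$, $\forall p = 1, 2, \ldots, k$:

$$\sum_{\alpha=0}^{s} p\Big(\begin{smallmatrix} j_1 \ldots j_{p-1}\; j_p\; j_{p+1} \ldots j_k \\ \alpha_1 \ldots \alpha_{p-1}\; \alpha\; \alpha_{p+1} \ldots \alpha_k \end{smallmatrix}\Big|\,\xi\Big) = p\Big(\begin{smallmatrix} j_1 \ldots j_{p-1}\; j_{p+1} \ldots j_k \\ \alpha_1 \ldots \alpha_{p-1}\; \alpha_{p+1} \ldots \alpha_k \end{smallmatrix}\Big|\,\xi\Big) \tag{4.14.21}$$

(iii) $\forall k \in \mathbf Z_+$, $\forall\, 0 \le j_1 < j_2 < \ldots < j_k$ integers:

$$\sum_{\alpha_1, \ldots, \alpha_k}^{0,s} p\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,\xi\Big) = 1 \tag{4.14.22}$$

(iv) $\forall k, h \in \mathbf Z_+$, $\forall\, 0 \le j_1 < j_2 < \ldots < j_k$, $0 \le i_1 < i_2 < \ldots < i_h$ integers, and
$\forall\, \alpha_1, \alpha_2, \ldots, \alpha_k, \beta_1, \beta_2, \ldots, \beta_h$ in $(0, 1, \ldots, s)$:

$$\lim_{N\to\infty} \frac{1}{N} \sum_{c=0}^{N-1} p\Big(\begin{smallmatrix} j_1 \ldots j_k\;\; i_1 + c \ldots i_h + c \\ \alpha_1 \ldots \alpha_k\;\; \beta_1\; \ldots\; \beta_h \end{smallmatrix}\Big|\,\xi\Big) = p\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,\xi\Big)\, p\Big(\begin{smallmatrix} i_1 \ldots i_h \\ \beta_1 \ldots \beta_h \end{smallmatrix}\Big|\,\xi\Big). \tag{4.14.23}$$

Properties (i), (ii), (iii), and (iv) will be referred to, respectively, as
“positivity”, “compatibility”, “normalization”, and “ergodicity” properties of the
frequencies of the motion “generated by $\xi$ and observed on $\mathcal E$”.


17 Here we use a well-known fact that when composing analytic functions, one obtains
analytic functions. The reader can attempt a proof of this starting with the $\ell = 1$ case.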


Observation. It will appear that the above proposition is just a fancy statement 
of the results already obtained. However, it is very useful because it introduces 
a few qualitative notions which are very natural and important. 


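PROOF. First suppose the existence of the frequencies of Eq. (4.14.19). Then
(i) is obvious, while (ii) and (iii) follow from Eqs. (4.14.16)-(4.14.18) and Eqs.
(4.14.16) and (4.14.17), respectively.

So it remains to prove the existence of the frequencies and (iv). The
existence of the frequencies for the sets $E\binom{j_1 \ldots j_k}{\alpha_1 \ldots \alpha_k}$ with $\alpha_1 \ne 0, \ldots, \alpha_k \ne 0$
follows from their analytic regularity stated after their definition and from
Proposition 29. It remains, therefore, to examine the cases when some among
$\alpha_1, \ldots, \alpha_k$ are 0.

Proceeding inductively from Eqs. (4.14.16), (4.14.18), note that if $k = 1$,
Eq. (4.14.17) and the definition of frequency imply existence of
$p\big(\begin{smallmatrix} j_1 \\ 0 \end{smallmatrix}\big|\,\xi\big)$ and, actually,

$$p\Big(\begin{smallmatrix} j_1 \\ 0 \end{smallmatrix}\Big|\,\xi\Big) = 1 - \sum_{\alpha=1}^{s} p\Big(\begin{smallmatrix} j_1 \\ \alpha \end{smallmatrix}\Big|\,\xi\Big). \tag{4.14.24}$$

In fact, in general, if $E$ is visited with well defined frequency $\nu$ its complement
is visited with frequency equal to $(1 - \nu)$. If $k = 2$, by the same arguments,
we deduce that for $\alpha > 0$,

$$p\Big(\begin{smallmatrix} j_1\, j_2 \\ \alpha\;\, 0 \end{smallmatrix}\Big|\,\xi\Big) = p\Big(\begin{smallmatrix} j_1 \\ \alpha \end{smallmatrix}\Big|\,\xi\Big) - \sum_{\beta=1}^{s} p\Big(\begin{smallmatrix} j_1\, j_2 \\ \alpha\;\, \beta \end{smallmatrix}\Big|\,\xi\Big). \tag{4.14.25}$$

Hence the frequency

$$p\Big(\begin{smallmatrix} j_1\, j_2 \\ 0\;\, 0 \end{smallmatrix}\Big|\,\xi\Big) = p\Big(\begin{smallmatrix} j_2 \\ 0 \end{smallmatrix}\Big|\,\xi\Big) - \sum_{\alpha=1}^{s} p\Big(\begin{smallmatrix} j_1\, j_2 \\ \alpha\;\, 0 \end{smallmatrix}\Big|\,\xi\Big), \tag{4.14.26}$$

exists for the same reasons, etc.

Finally, (iv) follows, by Proposition 31, immediately when $\alpha_1 \ne 0, \ldots, \alpha_k \ne 0$,
$\beta_1 \ne 0, \ldots, \beta_h \ne 0$, because $E\binom{j_1 \ldots j_k}{\alpha_1 \ldots \alpha_k}$ and $E\binom{i_1 \ldots i_h}{\beta_1 \ldots \beta_h}$ are analytically
regular and Eq. (4.14.23) is just a transcription in other symbols of Eq. (4.14.6).
However the general case, when some of the $\alpha$'s or $\beta$'s may be zero, can be
treated in the same way as that used to show the existence of the frequencies
of visit, see Eqs. (4.14.24)-(4.14.26). mbe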


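It is useful to reinterpret Proposition 33 as follows.
Given $\xi \in W$, consider the $(\mathcal E, t)$ history of $\xi$, see Definition 18, §4.14,
p.341. It is the sequence $a = (a_0, a_1, \ldots)$, $a_i = 0, 1, \ldots, s$, such that

$$S_{t_i}(\xi) \in E_{a_i}, \qquad i = 0, 1, \ldots. \tag{4.14.27}$$

The frequencies of Eq. (4.14.19) can be “computed” from the history $a$ as

$$p\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,\xi\Big) = \lim_{N\to\infty} \frac{1}{N}\, n_N\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,a\Big), \tag{4.14.28}$$

where $n_N(\ldots)$ is the number of values of $h$, integer and smaller than $N$, such
that

$$S_{(h+j_1)t_1}(\xi) \in E_{\alpha_1},\ \ldots,\ S_{(h+j_k)t_1}(\xi) \in E_{\alpha_k} \tag{4.14.29}$$

[see Eqs. (4.14.15) and (4.14.19)], i.e., it is the number of times when

$$a_{h+j_1} = \alpha_1,\ \ldots,\ a_{h+j_k} = \alpha_k \tag{4.14.30}$$

occur simultaneously, with $h$ integer in $[0, N)$. In other words, Eq. (4.14.28)
says that $p\big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\big|\,\xi\big)$ is the frequency of appearance of the “string $\alpha_1, \ldots, \alpha_k$”
at sites following each other at successive distances $j_2 - j_1, j_3 - j_2, \ldots, j_k - j_{k-1}$
in the history $a$ of $\xi$. It is then natural to set the following general definition.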
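19 Definition. Let $a = (a_i)_{i=0}^\infty$ be a sequence, $a_i = 0, 1, \ldots, s$, $\forall i \in \mathbf Z_+$.
Given $k > 0$, $0 \le j_1 < j_2 < \ldots < j_k$ integers and $\alpha_1, \ldots, \alpha_k$ in $(0, 1, \ldots, s)$
we say that a “string homologous to $\binom{j_1 \ldots j_k}{\alpha_1 \ldots \alpha_k}$” is “realized in $a$ at the $h$-th
site” if

$$a_{h+j_1} = \alpha_1,\quad a_{h+j_2} = \alpha_2,\quad \ldots,\quad a_{h+j_k} = \alpha_k. \tag{4.14.31}$$

The frequency of realization in $a$ of strings homologous to $\binom{j_1 \ldots j_k}{\alpha_1 \ldots \alpha_k}$ will be
defined in terms of the quantity

$$p_N\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,a\Big) = \frac{1}{N}\,\Big\{\text{number of times a string homologous to } \tbinom{j_1 \ldots j_k}{\alpha_1 \ldots \alpha_k} \text{ is realized in } a \text{ at sites } h \text{ between } 0 \text{ and } N\Big\}, \tag{4.14.32}$$

where the sites are the integers $h$ with $0 \le h < N$. We shall set

$$p\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,a\Big) \overset{\text{def}}{=} \lim_{N\to\infty} p_N\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,a\Big) \tag{4.14.33}$$

whenever the limit exists.

We shall say that a sequence $a$ is “ergodic” if:

(i) it has well-defined frequencies of appearance for all the strings of symbols,
i.e., the limits (4.14.33) exist for all choices of the indices;

(ii) there are at least two distinct symbols $\alpha, \beta$ occurring with positive
frequency in $a$:

$$p\Big(\begin{smallmatrix} 0 \\ \alpha \end{smallmatrix}\Big|\,a\Big) > 0, \qquad p\Big(\begin{smallmatrix} 0 \\ \beta \end{smallmatrix}\Big|\,a\Big) > 0; \tag{4.14.34}$$

(iii) for all choices of the indices,

$$\lim_{N\to\infty} \frac{1}{N} \sum_{c=0}^{N-1} p\Big(\begin{smallmatrix} j_1 \ldots j_k\;\; i_1 + c \ldots i_h + c \\ \alpha_1 \ldots \alpha_k\;\; \beta_1\; \ldots\; \beta_h \end{smallmatrix}\Big|\,a\Big) = p\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,a\Big)\, p\Big(\begin{smallmatrix} i_1 \ldots i_h \\ \beta_1 \ldots \beta_h \end{smallmatrix}\Big|\,a\Big). \tag{4.14.35}$$

As $k, j_1, \ldots, j_k, \alpha_1, \ldots, \alpha_k$ vary, the family of numbers (4.14.33) will be called
the “distribution of $a$”.

If $a$ only verifies (i) [or (i) and (ii)], it will be called a “sequence with
well-defined frequencies” (respectively, a “sequence with nontrivial frequencies”) of
the occurrence of the symbols.

Finally, an ergodic sequence is said to be “mixing” if for all the choices of indices,

$$\lim_{c\to\infty} p\Big(\begin{smallmatrix} j_1 \ldots j_k\;\; i_1 + c \ldots i_h + c \\ \alpha_1 \ldots \alpha_k\;\; \beta_1\; \ldots\; \beta_h \end{smallmatrix}\Big|\,a\Big) = p\Big(\begin{smallmatrix} j_1 \ldots j_k \\ \alpha_1 \ldots \alpha_k \end{smallmatrix}\Big|\,a\Big)\, p\Big(\begin{smallmatrix} i_1 \ldots i_h \\ \beta_1 \ldots \beta_h \end{smallmatrix}\Big|\,a\Big), \tag{4.14.36}$$

which is obviously stronger than Eq. (4.14.35).


Observations.

(1) From the definition, Eq. (4.14.33), of the distribution of $a$ as a family of
frequencies of certain events, it immediately follows that such numbers verify
Eqs. (4.14.20), (4.14.21), and (4.14.22) with $a$ replacing $\xi$.

(2) Using the language of probability theory (see §2.23), we can say that to any
sequence $a$ with well-defined frequencies of occurrence of the symbols it is
possible to associate a family $(E_k, p_k)_{k \ge 1}$ of probability distributions as follows.

$E_k$ will be the set of $(s+1)^k$ events, which we can denote $\alpha = (\alpha_0, \ldots, \alpha_{k-1})$,
$\alpha_j = 0, 1, \ldots, s$, whose probability is $p\big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\big|\,a\big)$. By Eq. (4.14.33), this
probability coincides, by definition, with the frequency of occurrence in $a$ of
strings homologous to $\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}}$.

For this reason, the sequence $(E_k, p_k)_{k \ge 1}$ is also called the “probability
distribution of the symbols of $a$” and $p\big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\big|\,a\big)$ is called the “probability of
the string $\alpha = (\alpha_0, \ldots, \alpha_{k-1})$ in $a$”.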
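
The finite-$N$ frequency of Eq. (4.14.32) is straightforward to compute from a stored piece of a history. The sketch below is only illustrative: the function name `string_frequency`, its argument names and the periodic sample sequence are assumptions made here, not taken from the text. Exact rational arithmetic is used so that the compatibility and normalization properties, Eqs. (4.14.21) and (4.14.22), can be checked without rounding.

```python
from fractions import Fraction

def string_frequency(a, sites, symbols, n_sites):
    """p_N of Eq. (4.14.32): fraction of sites h in [0, N) at which
    a[h + j_i] == alpha_i for every i."""
    if len(sites) != len(symbols):
        raise ValueError("sites and symbols must have the same length")
    if n_sites + max(sites) > len(a):
        raise ValueError("sequence too short for the requested window")
    hits = sum(
        1
        for h in range(n_sites)
        if all(a[h + j] == alpha for j, alpha in zip(sites, symbols))
    )
    return Fraction(hits, n_sites)

# Hypothetical sample history with symbols in (0, 1), i.e. s = 1.
a = [0, 1, 1] * 100
N = 150

p_1 = string_frequency(a, (0,), (1,), N)         # symbol 1 at site h
p_11 = string_frequency(a, (0, 1), (1, 1), N)    # string "1 1" at sites h, h+1
p_10 = string_frequency(a, (0, 1), (1, 0), N)    # string "1 0" at sites h, h+1
print(p_1, p_11, p_10)
```

Summing `p_11` and `p_10` over the second symbol reproduces `p_1`, which is the finite-$N$ form of the compatibility property; for finite $N$ this identity is exact because every site $h + 1$ carries exactly one symbol.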
Proposition 33 can be reinterpreted in terms of the above definition: 

34 Proposition. Under the assumptions of the preceding proposition, denote, for
$\xi \in W$, the $(\mathcal E, t)$ history of $\xi$, with $t = (i t_1)_{i=0}^\infty$, $t_1 > 0$, as $a(\xi)$. Then, if Eq.
(4.14.34) holds, $a(\xi)$ is an ergodic non mixing sequence.


Observation. The only statement not already contained in Proposition 33 is 
the one concerning mixing. 


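PROOF. By the assumed analytic integrability of the system, we can imagine
that $a = a(\xi)$ is the $(\mathcal E, t)$ history of a point $\varphi \in \mathbf T^\ell$ with respect to an
analytically regular partition $\mathcal E = (E_0, \ldots, E_s)$ of $\mathbf T^\ell$ and to the transformations
$(S_t)_{t\in\mathbf R}$ of $\mathbf T^\ell$ given by

$$S_t \varphi = \varphi + t\omega \mod 2\pi. \tag{4.14.37}$$

For simplicity, we shall only deal with the case when $\omega_1, \ldots, \omega_\ell, 2\pi/t_1$ are
rationally independent and when it is also assumed that there are two sets
$E_\alpha, E_\beta$ such that $p\big(\begin{smallmatrix} 0 \\ \alpha \end{smallmatrix}\big|\,a\big) > 0$, $p\big(\begin{smallmatrix} 0 \\ \beta \end{smallmatrix}\big|\,a\big) > 0$, having a diameter so small that
there is a point $\varphi_0 \in E_\alpha$ at a distance from $E_\beta$ greater than twice the
diameter of $E_\beta$.

These are serious restrictions. However, the general case can be reduced to
the above, as it will become apparent after having gone through the problems
at the end of this section.

The rational independence assumption on $\omega_1, \ldots, \omega_\ell, 2\pi/t_1$ and the analytic
regularity of $\mathcal E$ imply that

$$p\Big(\begin{smallmatrix} 0\;\, j \\ \gamma\;\, \gamma' \end{smallmatrix}\Big|\,a\Big) = \frac{1}{(2\pi)^\ell} \int_{E_\gamma \cap S_{-jt_1}E_{\gamma'}} d\varphi, \qquad \forall \gamma, \gamma' \tag{4.14.38}$$

[see Proposition 30, p.342, Eq. (4.14.5)].

If $a(\xi)$ were mixing, by Eq. (4.14.36), one would also have

$$\lim_{j\to\infty} p\Big(\begin{smallmatrix} 0\;\, j \\ \alpha\;\, \beta \end{smallmatrix}\Big|\,a\Big) = p\Big(\begin{smallmatrix} 0 \\ \alpha \end{smallmatrix}\Big|\,a\Big)\, p\Big(\begin{smallmatrix} 0 \\ \beta \end{smallmatrix}\Big|\,a\Big) > 0. \tag{4.14.39}$$

However this would mean that for $j$ large enough, it should be

$$p\Big(\begin{smallmatrix} 0\;\, j \\ \alpha\;\, \beta \end{smallmatrix}\Big|\,a\Big) = \frac{1}{(2\pi)^\ell} \int_{E_\alpha \cap S_{-jt_1}E_\beta} d\varphi > 0. \tag{4.14.40}$$

Hence, $E_\alpha \cap S_{-jt_1}E_\beta \ne \emptyset$ eventually. But, by the rational independence of
$\omega_1, \ldots, \omega_\ell, 2\pi/t_1$, the trajectory $\varphi - j t_1 \omega$, $j \ge j_0$, of any point $\varphi$ chosen in $E_\beta$ is
dense in $\mathbf T^\ell$ (see §4.2). Therefore, given $\varphi_0 \in E_\alpha$, there must exist infinitely
many values of $j > 0$ such that the distance of $\varphi - j t_1 \omega$ from $\varphi_0$ is less than
the diameter of $E_\beta$. For such values of $j$, it must be that $E_\alpha \cap S_{-jt_1}E_\beta = \emptyset$
since these torus rotations do not deform the sets but only translate
them, and $\varphi_0$ is chosen so that $d(\varphi_0, E_\beta) > \{$twice the diameter of $E_\beta\}$. mbe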
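
The lack of mixing can also be seen directly on a computed history. The sketch below is an illustration under assumptions that are not in the text: the circle is rescaled to length 1, the rotation step `ALPHA` is the golden mean, the partition has $E_1 = [0, 0.1)$ and $E_0$ its complement, and the function names are hypothetical. It estimates $p\big(\begin{smallmatrix} 0\; j \\ 1\; 1 \end{smallmatrix}\big|\,a\big)$ at two different lags $j$.

```python
import math

def rotation_history(phi0, alpha, width, length):
    """a_i = 1 if (phi0 + i*alpha) mod 1 lies in E_1 = [0, width), else 0."""
    return [1 if (phi0 + i * alpha) % 1.0 < width else 0 for i in range(length)]

def pair_frequency(a, j, n_sites):
    """Empirical p_N(0 j / 1 1 | a): fraction of h < N with a[h] == a[h+j] == 1."""
    return sum(1 for h in range(n_sites) if a[h] == 1 and a[h + j] == 1) / n_sites

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0   # hypothetical rotation step on R/Z
N = 20000
a = rotation_history(0.2, ALPHA, 0.1, N + 100)

p1 = sum(a[:N]) / N                 # frequency of the symbol 1
p_far = pair_frequency(a, 1, N)     # E_1 and its translate by ALPHA are disjoint
p_near = pair_frequency(a, 89, N)   # 89*ALPHA mod 1 is close to 0 (89 is a Fibonacci number)

print("p(1)^2        :", p1 * p1)
print("lag 1 pairs   :", p_far)
print("lag 89 pairs  :", p_near)
```

The pair frequency vanishes at lag 1 and is close to the full frequency of the symbol at lag 89, so it does not settle on the product required by Eq. (4.14.36), in agreement with the rigid-translation argument of the proof.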


4.14.1 Exercises and Problems 


Solve the following connected sequence of problems for $\ell = 2$ first, drawing
graphical representations of the various maps and transformations. The
notations are those of §4.14. The aim is to solve problem 8 below.

1. Let $\omega_1, \ldots, \omega_\ell$ be rationally dependent and not all zero. Show that there exist $\tilde\ell < \ell$
rationally independent numbers $\tilde\omega_1, \ldots, \tilde\omega_{\tilde\ell}$ and an $\ell \times \tilde\ell$ matrix $J$ with integer coefficients
such that $\omega = J\tilde\omega$, i.e., $\omega_j = \sum_{k=1}^{\tilde\ell} J_{jk}\, \tilde\omega_k$, $j = 1, \ldots, \ell$.


2. In R! consider the plane mp = JR! = {x| aj = SE JikYk, y € RZ and the plane mp 
orthogonal to it. Show that there exists an £ x (£ — £) matrix J+ with integer coefficients 
such that mp Z JER 


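3. Define the map $(J \times J^\perp)$ of $\mathbf T^{\tilde\ell} \times \mathbf T^{\ell-\tilde\ell}$ onto $\mathbf T^\ell$, $\forall (\vartheta, \nu) \in \mathbf T^{\tilde\ell} \times \mathbf T^{\ell-\tilde\ell}$, as:

$$(J \times J^\perp)(\vartheta, \nu) = (J\vartheta + J^\perp \nu) \mod 2\pi.$$

If one defines $\tilde E = (J \times J^\perp)^{-1} E$, for $E \subset \mathbf T^\ell$, and if $E$ is analytically regular in $\mathbf T^\ell$, show
that $\tilde E$ is such in $\mathbf T^{\tilde\ell} \times \mathbf T^{\ell-\tilde\ell} = \mathbf T^\ell$. (Hint: Note that $(J \times J^\perp)$, regarded as a matrix
denoted $(J \times J^\perp)_R$, linearly maps $\mathbf R^\ell$ onto $\mathbf R^\ell$; hence, $\det (J \times J^\perp)_R \ne 0$. Hence, $(J \times J^\perp)_R^{-1} E$
is analytically regular in $\mathbf R^\ell$ and $\tilde E$ is obtained by considering $(J \times J^\perp)_R^{-1} E$, after reducing
mod $2\pi$ the coordinates of its points, as a subset of the torus $\mathbf T^{\tilde\ell} \times \mathbf T^{\ell-\tilde\ell} = \mathbf T^\ell$.)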


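4. If $\varphi_0 \in \mathbf T^\ell$ and $\varphi_0 = (J \times J^\perp)_R(\vartheta_0, \nu_0)$ show that the frequency of visits to $E$ of the
trajectory of $\varphi_0$ under the transformation $\varphi_0 \to \varphi_0 + t\omega$ coincides with the frequency of
visit to $\tilde E$ of the trajectory of $(\vartheta_0, \nu_0)$ under the transformation $(\vartheta_0, \nu_0) \to (\vartheta_0 + \tilde\omega t, \nu_0)$.
(Hint: Note that $\varphi_0 + \omega t = (J \times J^\perp)_R(\vartheta_0 + \tilde\omega t, \nu_0)$ by the construction of $J$.)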


5. Let È(vo) = EN{(9,v) | (9, v) € Tex Te! v= vo}, then the frequency of visits to E 
of the trajectory of po for the transformations yo — Yo + tw coincides with the frequency 


of visits to E(w) = { 8| V€ TË, (9, vo) € E(vo)} c T? by the trajectory of Vo under the 




transformation Yo > Vo + @t. Furthermore, if E is analytically regular in T’, then P(o) 


is such in 7’ (Hint: Interpret Ẹ(vo) as the intersection of Ẹ(vo) with a “plane”.) 


6. If $\omega_1, \ldots, \omega_\ell$ are rationally independent but $\omega_1, \ldots, \omega_\ell, 2\pi/t_1$ are not rationally
independent there are $\ell$ integers $m_1, \ldots, m_\ell$ and $q > 0$, integer too, such that

$$\frac{2\pi}{t_1} = \frac{m \cdot \omega}{q}.$$

The problem of the determination of the frequency of visits to $E \subset \mathbf T^\ell$ by the trajectory
of $\varphi \in \mathbf T^\ell$ under the map $\varphi \to \varphi + \omega\, t_1 j$, $j = 0, 1, \ldots$ is equivalent (via a suitable change
of coordinates) to the analogous problem when the relation between $2\pi/t_1$ and $\omega$ is simply
$2\pi/t_1 = \frac{m}{q}\, \omega_1$. (Hint: The transformation is analogous to that described in Problems 2 and 3
above. It is the transformation associated, in the same way as above, to the matrix $J$ of the
transformation


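7. Consider the trajectory of $\varphi_0 \in \mathbf T^\ell$ under the transformations $\varphi_0 \to \varphi_0 + j t_1 \omega$ with
$2\pi/t_1 = \frac{m}{q}\, \omega_1$, $m, q$ integers, and assume that $\omega_1, \ldots, \omega_\ell$ are rationally independent.
Think of $\mathbf T^\ell$ as $\mathbf T^1 \times \mathbf T^{\ell-1}$ and, if $(\varphi, \psi) \in \mathbf T^1 \times \mathbf T^{\ell-1}$, show that the map under analysis
can be written as $(\varphi, \psi) \to (\varphi + 2\pi \frac{q}{m}\, j,\ \psi + \omega' j)$ where $\omega' = (\omega'_1, \ldots, \omega'_{\ell-1})$ are $\ell - 1$
rationally independent numbers which together with $2\pi$ form a set of $\ell$ rationally independent
numbers.

If $E \subset \mathbf T^\ell$ is analytically regular, show that the frequency of visit to $E$ exists and depends
only on $\varphi$. (Hint: Note that

$$\frac{1}{mM} \sum_{j=0}^{mM-1} \chi_E\Big(\varphi_0 + \frac{2\pi q}{m}\, j,\ \psi_0 + \omega' j\Big)
= \frac{1}{mM} \sum_{k=0}^{m-1} \sum_{p=0}^{M-1} \chi_E\Big(\varphi_0 + \frac{2\pi q}{m}(k + mp),\ (\psi_0 + k\omega') + mp\,\omega'\Big)$$

$$= \frac{1}{m} \sum_{k=0}^{m-1} \Big( \frac{1}{M} \sum_{p=0}^{M-1} \chi_E\Big(\varphi_0 + \frac{2\pi q}{m}(k + mp),\ (\psi_0 + k\omega') + mp\,\omega'\Big) \Big),$$

and, letting $\varphi_k = \varphi_0 + \frac{2\pi q}{m}\, k$, $\psi_k = \psi_0 + k\omega'$, this can be rewritten

$$\frac{1}{m} \sum_{k=0}^{m-1} \Big( \frac{1}{M} \sum_{p=0}^{M-1} \chi_E(\varphi_k,\ \psi_k + mp\,\omega') \Big) = \frac{1}{m} \sum_{k=0}^{m-1} \frac{1}{M} \sum_{p=0}^{M-1} \chi_{E_k(\varphi_0)}(\psi_k + mp\,\omega'),$$

where $E_k(\varphi_0) = \{\psi \mid \psi \in \mathbf T^{\ell-1},\ (\varphi_0 + \frac{2\pi q}{m}\, k,\ \psi) \in E\}$ is still analytically regular for
$k = 0, \ldots, m-1$. Hence, the frequency of visit to $E$ exists because $m\omega'$ has rationally
independent components and it is given by $\frac{1}{m} \sum_{k=0}^{m-1} \int_{E_k(\varphi_0)} \frac{d\psi}{(2\pi)^{\ell-1}}$.)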


8. On the basis of the above problems, deduce the proofs of Propositions 30 and 31 in the
general case, from their validity in the rationally independent cases.


4.15 Analytic Integrability Criteria. Complexity of
Motions and Entropy


Summarizing the discussion of the preceding sections, the following criteria of non
analytic integrability, on a phase space subset $W$, have been obtained for an
analytic time-independent Hamiltonian system:


(i) if in $W$ there is one $\xi$ whose $(\mathcal E, t)$ history ($t = (i t_1)_{i=0}^\infty$) on an analytically
regular partition $\mathcal E$ of $W$ contains some strings without well-defined frequency
of occurrence;

(ii) if in $W$ there is one $\xi$ whose trajectory $\mathcal T$ has a closure $\overline{\mathcal T}$ that cannot be
mapped bicontinuously on a torus $\mathbf T^s$, $s \le \ell$, $\ell$ being the number of degrees
of freedom;

(iii) if in $W$ there is one $\xi$ whose $(\mathcal E, t)$ history on an analytically regular
partition $\mathcal E$ of $W$ has nontrivial frequency distributions but is not ergodic;

(iv) if in $W$ there is one $\xi$ whose $(\mathcal E, t)$ history on an analytically regular
partition $\mathcal E$ of $W$ is “too ergodic”, i.e., mixing.


The review of non integrability criteria will be concluded by examining
another very interesting property of the analytically integrable systems: namely
that the motions of such systems have a “small complexity”. This leads to
another non integrability criterion, see (v), p.359.

To obtain such a result a quantitative meaning is needed for the notion
of “complexity” of the motions associated with points moving on a regular
(analytic) surface under the action of a family (“semigroup”) $(S_t)_{t\in\mathbf R_+}$ of $C^\infty$
(analytic) transformations.

A natural way to evaluate the complexity of a motion is to count the
number of different strings of history appearing in the $(\mathcal E, t)$ history of the
motion on an analytically regular partition.


20 Definition. Let $a$ be a sequence $a = (a_i)_{i=0}^\infty$, $a_i \in (0, \ldots, s)$. Assume that
$a$ has well-defined frequencies of symbol appearances (see p.348).
The “number of strings of symbols of length $k$ appearing in $a$” is defined as

$$N_{abs}(a, k) = \Big\{\text{number of choices of } (\alpha_0, \ldots, \alpha_{k-1}) \in (0, \ldots, s)^k \text{ such that } p\Big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\Big|\,a\Big) > 0\Big\} \tag{4.15.1}$$

where, we recall, $p\big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\big|\,a\big)$ denotes the frequency of appearance in $a$
of a string homologous to $\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}}$, see Definition 19, p.348.
Clearly $N_{abs}(a, k) \le (s+1)^k$. We shall set (see footnote 18)

$$S_{abs}(a) = \lim_{k\to+\infty} \frac{1}{k} \log N_{abs}(a, k) \tag{4.15.2}$$

which we call the “absolute complexity” of the sequence $a$.


18 The limit always exists (see Problem 21, p.364).

Observations.

(1) The number in Eq. (4.15.2) can give an idea of how complex the sequence
$a$ might be. However, $S_{abs}(a)$ is a rather rough measure of the complexity of
$a$: in its evaluation, in fact, one puts on the same footing strings occurring in
$a$ with a frequency of occurrence much smaller than that of other strings or,
by Observation (2) to Definition 19, p.348, with a “probability” much
smaller than that of others.

(2) The existence of the limit of Eq. (4.15.2) is easy to prove and very
instructive (see Problem 23 below).


The following more sophisticated definition takes into account the possi- 
bility that some strings may be present in a with extremely small probability 
and gives them less importance. 


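21 Definition. Let $a = (a_i)_{i=0}^\infty$, $a_i = 0, 1, \ldots, s$, be a sequence
with well defined frequencies of symbol occurrence as in Definition 20.

Given $\varepsilon > 0$, consider all the possible subsets $C_\varepsilon$ of the set of the $k$-tuples
$\alpha_0, \ldots, \alpha_{k-1}$, $\alpha_i = 0, \ldots, s$, such that

$$\sum_{(\alpha_0, \ldots, \alpha_{k-1}) \in C_\varepsilon} p\Big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\Big|\,a\Big) < \varepsilon. \tag{4.15.3}$$

These are the sets $C_\varepsilon$ of “$k$-strings” (strings of length $k$) whose total frequency
of occurrence is smaller than $\varepsilon$. Let

$$N(a, k, \varepsilon) = \{\text{minimum, over the choices of } C_\varepsilon, \text{ of the number of } k\text{-tuples outside } C_\varepsilon\} \tag{4.15.4}$$

and let

$$S(a, \varepsilon) = \limsup_{k\to+\infty} \frac{1}{k} \log N(a, k, \varepsilon), \tag{4.15.5}$$

$$S(a) = \lim_{\varepsilon\to 0} S(a, \varepsilon). \tag{4.15.6}$$

This last quantity will be called the “entropy” of $a$ and it can also be regarded
as a measure of the complexity of $a$.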
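
For a finite stretch of a sequence, the counts entering Definitions 20 and 21 can be computed from empirical $k$-string frequencies. The following is a minimal sketch, assuming the frequencies are estimated on the first $N$ sites; the function names `block_frequencies`, `n_abs`, `n_eps` and the sample sequence are hypothetical. The minimum in Eq. (4.15.4) is obtained by discarding the rarest strings first, for as long as the discarded total stays below $\varepsilon$.

```python
from collections import Counter
from fractions import Fraction

def block_frequencies(a, k, n_sites):
    """Empirical frequencies of the k-strings a[h:h+k], for h in [0, N)."""
    if n_sites + k - 1 > len(a):
        raise ValueError("sequence too short for the requested window")
    counts = Counter(tuple(a[h:h + k]) for h in range(n_sites))
    return {block: Fraction(c, n_sites) for block, c in counts.items()}

def n_abs(freqs):
    """Number of k-strings with positive frequency, as in Eq. (4.15.1)."""
    return sum(1 for p in freqs.values() if p > 0)

def n_eps(freqs, eps):
    """N(a, k, eps) of Eq. (4.15.4): discard the rarest strings while the
    discarded total frequency stays strictly below eps; count what is left."""
    discarded = Fraction(0)
    kept = len(freqs)
    for p in sorted(freqs.values()):
        if discarded + p < eps:
            discarded += p
            kept -= 1
        else:
            break
    return kept

# Hypothetical periodic sample: 2-strings (0,0), (0,1), (1,0) with
# frequencies 1/2, 1/4, 1/4 respectively.
a = [0, 0, 0, 1] * 50
freqs = block_frequencies(a, 2, 100)
print(n_abs(freqs), n_eps(freqs, Fraction(3, 10)), n_eps(freqs, Fraction(6, 10)))
```

Strings that never occur have zero frequency and can always be placed in $C_\varepsilon$, so they are simply absent from the dictionary and never counted.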


Observations. 

(1) This is a measure of complexity more interesting than Eq. (4.15.2). 
Through Eq. (4.15.4) and the two limits in Eqs. (4.15.5) and (4.15.6), in 
some way, one discards from the number of strings of a those which appear 
with a very small frequency (see, also, Proposition 37 to follow). 

(2) Obviously,

$$0 \le S(a) \le S_{abs}(a) \le \log(s+1), \tag{4.15.7}$$

and one can note that the two numbers given in Eqs. (4.15.2) and (4.15.6)
can be thought of as obtained by permuting the following two limits:

$$S_{abs}(a) = \lim_{k\to\infty} \lim_{\varepsilon\to 0} \frac{1}{k} \log N(a, k, \varepsilon), \tag{4.15.8}$$

$$S(a) = \lim_{\varepsilon\to 0} \lim_{k\to\infty} \frac{1}{k} \log N(a, k, \varepsilon), \tag{4.15.9}$$

if all the above limits exist.

(3) The term entropy given to Eq. (4.15.6) is due to the analogy of this 
definition with Boltzmann’s fundamental idea on the proportionality between 
the entropy of the state of a system, in the thermodynamic sense of the word, 
and the number of ways of realizing the same macroscopic state by equivalent 
microscopic states. 

This analogy is evident if one is not biased by the various limit steps taken in 
Eqs. (4.15.5), (4.15.6), (4.15.8), and (4.15.9), and at first one ignores them. 


The following proposition holds. 


35 Proposition. Consider a Hamiltonian system analytically integrable on
the phase-space subset $W$. Let $\mathcal E = (E_0, \ldots, E_s)$ be an analytically regular
partition of $W$. Let $t_1 > 0$, $t = (i t_1)_{i=0}^\infty$.

For all $\xi \in W$, denote by $a(\xi)$ the $(\mathcal E, t)$ history of $\xi$. Then

$$S(a(\xi)) = 0, \qquad \forall \xi \in W. \tag{4.15.10}$$


Observation. As already seen in the propositions of §4.14, the statement of 
this proposition is an immediate consequence of an analogous proposition 
concerning the torus rotations. In this case, the proposition is the following. 


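36 Proposition. Let $\omega \in \mathbf R^\ell$ and let $(S_t)_{t\in\mathbf R}$ be the quasi-periodic flow on
$\mathbf T^\ell$ with pulsations $\omega$ (i.e. $S_t\varphi = \varphi + t\omega$). Consider the transformations
$(S_{jt_1})_{j=0}^\infty$, $t_1 > 0$, and let $\mathcal E = (E_0, \ldots, E_s)$ be an analytically regular partition
of $\mathbf T^\ell$ into $(s+1)$ sets. The $(\mathcal E, t)$-history $a(\varphi)$ of $\varphi \in \mathbf T^\ell$ is such that

$$S(a(\varphi)) = 0, \qquad \forall \varphi \in \mathbf T^\ell. \tag{4.15.11}$$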


Observations. 

(1) The argument presented in the proof below essentially gives the proof of a 
more general theorem of great importance in the theory of entropy (“Koush- 
nirenko’s theorem”, [4]). 

(2) Actually, one could prove a stronger result, namely,

$$S_{abs}(a(\varphi)) = 0, \qquad \forall \varphi \in \mathbf T^\ell. \tag{4.15.12}$$

However, in the course of the proof, we show Eq. (4.15.12) only in the $\ell = 1$
case. The argument could be adapted to prove Eq. (4.15.12) in general.
However, for $\ell > 1$, an alternative proof of the weaker result of Eq. (4.15.11) is
preferable because the method of this proof is in itself interesting and, as
mentioned in Observation (1), contains the germs of interesting extensions.

(3) Equations (4.15.10) and (4.15.11) have an interesting monotonicity
property: if $\mathcal E'$ is a partition finer than $\mathcal E$ in the sense that every set in $\mathcal E$ can be
thought of as a union of sets in $\mathcal E'$, then the absolute complexity (and the
entropy) of $a'(\varphi)$ is not smaller than that of $a(\varphi)$. This reflects the intuitively
clear fact that by increasing the precision of the measurements, the motion can
only look more complicated since more of its features may become manifest.


PROOF: As mentioned in Observation (2), the cases $\ell = 1$ and $\ell > 1$ will
be considered separately. We only treat the case when $\omega_1, \ldots, \omega_\ell, 2\pi/t_1$ are
rationally independent. The problems of §4.14 show that the general case can
be reduced to this special one.


Case $\ell = 1$. To fix the ideas, suppose $s = 1$ and $E_0 = (\lambda, 2\pi)$, $E_1 =
[0, \lambda]$, $\lambda \in (0, 2\pi)$. Consider the images of the points 0 and $\lambda$ for the maps
$\varphi \to \varphi + j t_1 \omega_1$, $j = 0, \ldots, k-1$. There are at most $2(k+1)$ points (and at
least 2) dividing the interval $[0, 2\pi]$ in $2(k+1)$, at most, consecutive intervals
$J_1, J_2, \ldots$. Then all points internal to some such interval have the same $(\mathcal E, t)$
history in the first $k$ sites of their history.

To the $2(k+1)$, at most, histories of the points internal to the above
intervals, we can add the $2(k+1)$ histories, at most, of their extreme points.
We thus obtain all the possible strings of the history with length $k$ that can
appear in the $(\mathcal E, t)$ history of a point $\varphi \in \mathbf T^1$. Hence,

$$N_{abs}(a(\varphi), k) \le 4(k+1) \tag{4.15.13}$$

and Eq. (4.15.12) follows from the definition given by Eq. (4.15.2).
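
The linear growth of the number of strings in the $\ell = 1$ case is easy to observe numerically. The sketch below makes several assumptions that are not in the text: the circle is rescaled to length 1, the rotation step `ALPHA` is the golden mean, $\lambda = 0.3$, the starting point is arbitrary, and only the strings actually met along a finite orbit are counted (function names are hypothetical).

```python
import math

def rotation_history(phi0, alpha, lam, length):
    """Symbols a_i = 1 if (phi0 + i*alpha) mod 1 is in E_1 = [0, lam], else 0."""
    return [1 if (phi0 + i * alpha) % 1.0 <= lam else 0 for i in range(length)]

def count_blocks(a, k):
    """Number of distinct k-strings occurring in the finite sequence a."""
    return len({tuple(a[h:h + k]) for h in range(len(a) - k + 1)})

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0   # hypothetical rotation step on R/Z
a = rotation_history(0.1234, ALPHA, 0.3, 20000)

for k in (1, 2, 4, 8, 16):
    n = count_blocks(a, k)
    # columns: k, observed strings, bound 4(k+1), (1/k) log of the count
    print(k, n, 4 * (k + 1), math.log(n) / k)
```

The observed counts stay below the bound $4(k+1)$ of Eq. (4.15.13), and $(1/k)\log$ of the count decreases with $k$, as expected for a sequence of zero absolute complexity.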


Case $\ell > 1$. The entire proof will be based on the possibility of estimating
the volume $|E|$ of a set $E$ in terms of the area $|\partial E|$ of its boundary $\partial E$. If
$E \subset \mathbf R^\ell$ is a bounded set its volume $|E|$ cannot exceed the volume of the
sphere with surface area equal to the surface area $|\partial E|$ of $E$ (“isoperimetric
inequality”). So an inequality of the type

$$|E| \le C_\ell\, |\partial E|^{\frac{\ell}{\ell-1}} \tag{4.15.14}$$

holds, $C_\ell$ being a suitable $E$-independent constant. However, on $\mathbf T^\ell$, such an
inequality is false for sets which “wrap around $\mathbf T^\ell$” (e.g., if $E = \mathbf T^\ell$, $|E| =
(2\pi)^\ell$, $|\partial E| = 0$ as $\partial E = \emptyset$); but of course, it is still true for sets with small
enough diameter.

To apply isoperimetric inequalities in $\mathbf T^\ell$, it is therefore useful to think of
$\mathbf T^\ell$ as the union of many small sets. We shall regard $\mathbf T^\ell$ as a union of $2^\ell$ cubes
with side $\pi$ parameterized by an index $\sigma$:

$$C_\sigma = \{\varphi \mid \varphi \in \mathbf T^\ell,\ \pi\sigma_i \le \varphi_i \le \pi(\sigma_i + 1),\ i = 1, \ldots, \ell\} \tag{4.15.15}$$

where each $\sigma_i$ takes the value 0 or 1. We call $\Sigma$ the set of the $2^\ell$ $\sigma$'s.


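Given $(\alpha_0, \ldots, \alpha_{k-1}) \in \{0, \ldots, s\}^k$ and $(\sigma_0, \ldots, \sigma_{k-1}) \in \Sigma^k$, consider the
sets

$$E\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}} = E_{\alpha_0} \cap S_{-t_1}E_{\alpha_1} \cap \ldots \cap S_{-(k-1)t_1}E_{\alpha_{k-1}},$$

$$B\binom{0\, \ldots\, k-1}{\sigma_0 \ldots \sigma_{k-1}} = C_{\sigma_0} \cap S_{-t_1}C_{\sigma_1} \cap \ldots \cap S_{-(k-1)t_1}C_{\sigma_{k-1}}. \tag{4.15.16}$$

Since the rotations of the torus are “rigid transformations”, i.e., they do not
change the form and volume of the sets that they transform, it will be possible
to infer that the sum of the surfaces of the sets $E \cap B$, with $E, B$ like Eq.
(4.15.16) with the same value of $k$, is such that

$$\sum_{\alpha_0, \ldots, \alpha_{k-1}}\ \sum_{\sigma_0, \ldots, \sigma_{k-1}} \Big| \partial\Big( E\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}} \cap B\binom{0\, \ldots\, k-1}{\sigma_0 \ldots \sigma_{k-1}} \Big) \Big| \le 2(k+1)L, \tag{4.15.17}$$

where $L = \sum_{i=0}^{s} |\partial E_i| + 2^\ell (2\ell\pi^{\ell-1})$. This simple relation follows from the
geometric observation that

$$\bigcup_{\alpha_0 \ldots \alpha_{k-1}}\ \bigcup_{\sigma_0 \ldots \sigma_{k-1}} \partial\Big( E\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}} \cap B\binom{0\, \ldots\, k-1}{\sigma_0 \ldots \sigma_{k-1}} \Big) = \bigcup_{h=0}^{k-1}\ \bigcup_{\alpha_h, \sigma_h} \big[ S_{-ht_1}(\partial E_{\alpha_h}) \cup S_{-ht_1}(\partial C_{\sigma_h}) \big], \tag{4.15.18}$$

and the right-hand-side points are counted twice in the left-hand side except
for a subset of total area zero corresponding to the edges and corners of the
sets $E\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}} \cap B\binom{0\, \ldots\, k-1}{\sigma_0 \ldots \sigma_{k-1}}$. We can now use Eq. (4.15.14) to bound

$$p\Big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\Big|\,a(\varphi)\Big) = \frac{1}{(2\pi)^\ell} \Big| E\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}} \Big|
= \sum_{\sigma_0, \ldots, \sigma_{k-1}} \frac{1}{(2\pi)^\ell} \Big| E\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}} \cap B\binom{0\, \ldots\, k-1}{\sigma_0 \ldots \sigma_{k-1}} \Big|$$

$$\le C_\ell \sum_{\sigma_0, \ldots, \sigma_{k-1}} \frac{1}{(2\pi)^\ell} \Big| \partial\Big( E\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}} \cap B\binom{0\, \ldots\, k-1}{\sigma_0 \ldots \sigma_{k-1}} \Big) \Big|^{\frac{\ell}{\ell-1}}
\le \frac{C_\ell}{(2\pi)^\ell} \Big( \sum_{\sigma_0, \ldots, \sigma_{k-1}} \Big| \partial\Big( E\binom{0\, \ldots\, k-1}{\alpha_0 \ldots \alpha_{k-1}} \cap B\binom{0\, \ldots\, k-1}{\sigma_0 \ldots \sigma_{k-1}} \Big) \Big| \Big)^{\frac{\ell}{\ell-1}}, \tag{4.15.19}$$

having used the rational independence of $(\omega_1, \ldots, \omega_\ell, 2\pi/t_1)$ in the first step
[applying Proposition 30, Eq. (4.14.5)], and in the last step the inequality
$(\alpha + \beta)^x \ge \alpha^x + \beta^x$, $\forall x \ge 1$, $\forall \alpha, \beta \ge 0$, has also been used. The isoperimetric
inequality has been used in the intermediate step.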

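Equations (4.15.19) and (4.15.17) will now be used to estimate the total
frequency of the strings of length $k$ in $a(\varphi)$ having “small probability”,
precisely such that, given $\eta > 0$,

$$p\Big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\Big|\,a(\varphi)\Big) < e^{-\eta k}. \tag{4.15.20}$$

Recalling the ideas involved in the proof of the Chebyshev inequality,
Proposition 34, p.119, and if the label $*$ indicates that the sum is restricted to
the strings verifying Eq. (4.15.20), one finds

$$\sum_{\alpha_0, \ldots, \alpha_{k-1}}^{*} p\Big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\Big|\,a(\varphi)\Big) \le e^{-\gamma\eta k} \sum_{\alpha_0, \ldots, \alpha_{k-1}} p\Big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\Big|\,a(\varphi)\Big)^{1-\gamma} \tag{4.15.21}$$

no matter how $\gamma > 0$ is chosen. Then let $\gamma = \frac{1}{\ell}$, i.e., such that $(1-\gamma)\frac{\ell}{\ell-1} = 1$,
and deduce from Eqs. (4.15.21), (4.15.19), and (4.15.17) that the total
probability that Eq. (4.15.20) holds is bounded by

$$\sum_{\alpha_0, \ldots, \alpha_{k-1}}^{*} p\Big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\Big|\,a(\varphi)\Big) \le (2\pi)^{-(\ell-1)}\, C_\ell^{\frac{\ell-1}{\ell}}\, e^{-\eta k/\ell}\, 2(k+1)L. \tag{4.15.22}$$

Hence, given $\varepsilon > 0$ and $\eta > 0$, Eq. (4.15.22) shows that if $k$ is so large that
the right-hand side of Eq. (4.15.22) is smaller than $\varepsilon$, we can find, among the
sets $C_\varepsilon$ appearing in Eq. (4.15.3), a set

$$C_\varepsilon(\eta) = \{\text{set of the } k\text{-tuples } \alpha_0, \ldots, \alpha_{k-1} \text{ verifying Eq. (4.15.20)}\}. \tag{4.15.23}$$

Since [see Eq. (4.14.22) and Observation (1) to Definition 19, p.348] it is
$\sum_{\alpha_0, \ldots, \alpha_{k-1}} p\big(\begin{smallmatrix} 0\, \ldots\, k-1 \\ \alpha_0 \ldots \alpha_{k-1} \end{smallmatrix}\big|\,a(\varphi)\big) = 1$, it becomes clear that the complement of
$C_\varepsilon(\eta)$ cannot contain more than $e^{\eta k}$ elements because it consists of strings with
probability $\ge e^{-\eta k}$. One then finds that

$$N(a(\varphi), k, \varepsilon) \le e^{\eta k} \tag{4.15.24}$$

if $k$ is large enough. Hence,

$$S(a(\varphi), \varepsilon) \le \eta \tag{4.15.25}$$

and Eq. (4.15.11) follows from the arbitrariness of $\eta$.

So far the analytic regularity of $\mathcal E$ has been only used to deduce the first of
Eqs. (4.15.19) which, as remarked elsewhere (see Observation (1), to
Proposition 30, p.342), follows simply from the Riemann measurability of $E_1, \ldots, E_s$.
However, in the general case when $\omega_1, \ldots, \omega_\ell, 2\pi/t_1$ are not rationally
independent, as assumed above, analytic regularity has to be used again to reduce
the general case to the above-treated rationally independent case. mbe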
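
The Chebyshev-type step and the choice of the exponent can be spelled out as follows. This is a sketch written with abbreviations introduced only here: $p_\alpha$ for the frequency of the string $(\alpha_0, \ldots, \alpha_{k-1})$ in $a(\varphi)$, and $E(\alpha)$, $B(\sigma)$ for the sets of Eq. (4.15.16). It assumes that the isoperimetric inequality, Eq. (4.15.14), is used with the exponent $\ell/(\ell-1)$.

```latex
% Strings in the starred sum satisfy p_\alpha < e^{-\eta k}, hence p_\alpha^{\gamma} < e^{-\gamma \eta k}:
\sum_{\alpha}{}^{*}\, p_\alpha
  = \sum_{\alpha}{}^{*}\, p_\alpha^{\gamma}\, p_\alpha^{1-\gamma}
  \le e^{-\gamma \eta k} \sum_{\alpha} p_\alpha^{1-\gamma} ,
\qquad
p_\alpha^{1-\gamma}
  \le \Big( \frac{C_\ell}{(2\pi)^{\ell}} \Big)^{1-\gamma}
      \Big( \sum_{\sigma} \big| \partial \big( E(\alpha) \cap B(\sigma) \big) \big| \Big)^{(1-\gamma)\frac{\ell}{\ell-1}} .

% With \gamma = 1/\ell the last exponent equals 1, so the sum over \alpha
% is controlled by the total boundary area:
\sum_{\alpha}{}^{*}\, p_\alpha
  \le e^{-\eta k/\ell}
      \Big( \frac{C_\ell}{(2\pi)^{\ell}} \Big)^{\frac{\ell-1}{\ell}}
      \sum_{\alpha} \sum_{\sigma} \big| \partial \big( E(\alpha) \cap B(\sigma) \big) \big| .
```

The double sum on the right is the quantity bounded linearly in $k$ by Eq. (4.15.17), so the exponential factor $e^{-\eta k/\ell}$ dominates for large $k$.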
The above propositions provide a further non integrability criterion. 


(v) If in $W$ there is one $\xi$ whose $(\mathcal E, t)$ history on an analytically regular
partition $\mathcal E$ of $W$ has positive entropy, then the system is not analytically
integrable on $W$.


This criterion can be added to those listed at the beginning of §4.15, p.353 
and to the other criteria, also quite remarkable, that emerge from the problems 
at the end of this section, see problems 13-20. We now quote, without proof, 
some results on entropy theory and non integrable systems showing that in 
fact the previously stated non integrability criteria (i), (iii), (iv), and (v) are 
not empty of content [(ii) has already been discussed in §4.13, Observation (3) 
Proposition 27, p.336], i.e. the propositions below illustrate other properties 
of entropy (Proposition 37) or they show that there actually are systems 
whose non integrability could be decided on the basis of the above criteria 
(Proposition 38). 


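37 Proposition. Let $a = (a_i)_{i \in \mathbf{Z}_+}$, $a_i = 0, 1, \ldots, p-1$, be an ergodic sequence.
(i) The entropy of $a$ can be computed as


$$S(a) = \lim_{N\to\infty} -\frac{1}{N} \sum_{a_0, \ldots, a_{N-1}} p\Big({0 \ldots N-1 \atop a_0 \ldots a_{N-1}}\Big|\,a\Big) \log p\Big({0 \ldots N-1 \atop a_0 \ldots a_{N-1}}\Big|\,a\Big). \qquad (4.15.26)$$

(ii) Given $\varepsilon > 0$, there exists $N_\varepsilon$ such that $\forall N > N_\varepsilon$ the $p^N$ strings
$\big({0 \ldots N-1 \atop a_0 \ldots a_{N-1}}\big)$ of history with length $N$, a priori possible, can be divided into
two classes $C^1_\varepsilon(N)$ and $C^{\rm rare}_\varepsilon(N)$ such that


$$\sum_{a_0, \ldots, a_{N-1} \in C^{\rm rare}_\varepsilon(N)} p\Big({0 \ldots N-1 \atop a_0 \ldots a_{N-1}}\Big|\,a\Big) < \varepsilon \qquad (4.15.27)$$


$$e^{-(S(a)+\varepsilon)N} < p\Big({0 \ldots N-1 \atop a_0 \ldots a_{N-1}}\Big|\,a\Big) < e^{-(S(a)-\varepsilon)N}, \qquad \forall\, (a_0 \ldots a_{N-1}) \in C^1_\varepsilon(N). \qquad (4.15.28)$$

(iii) The number of elements in $C^1_\varepsilon(N)$ is such that


$$e^{(S(a)-\varepsilon)N} < \#\,C^1_\varepsilon(N) < e^{(S(a)+\varepsilon)N}. \qquad (4.15.29)$$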


Observations. 

(1) This is the “Shannon-McMillan theorem”, [25]. 

(2) Equation (4.15.26) is very useful because it sometimes allows the explicit 
calculation of S(a). The statement (ii) tells us that if N is large the number 
of strings of a that are “really important” is measured by S(a). Furthermore, 
such strings have about the same probability of appearance, and their number 
is therefore estimated by Eq. (4.15.29). 

In other words, one can think that in a rough (and weak) sense, see Eqs.
(4.15.27) and (4.15.28), $a$ consists of strings of large length, each appearing
with “almost” equal probability (i.e., “almost” equally often) in $a$. If $a$ is not ergodic,
this last statement is not generally true: this is one of the reasons why the
ergodic sequences are interesting.
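
Statements (ii) and (iii) can be checked numerically in the simplest situation. The following is a minimal Python sketch, not taken from the text, under the assumption that the string frequencies of $a$ coincide with the probabilities of independent symbols 0, 1 with weights $p_0$ and $1 - p_0$ (as for the sequences built in Problems 6-12 below). The name `typical_class` and the values $p_0 = 0.3$, $N = 400$, $\varepsilon = 0.05$ are illustrative choices. The sketch collects the strings whose probability satisfies Eq. (4.15.28) and reports their number and total weight.

```python
import math

def typical_class(p0, n, eps):
    """Strings of n independent 0/1 symbols (weights p0, 1 - p0) whose
    probability lies between exp(-(S + eps) n) and exp(-(S - eps) n).
    Returns the entropy S, the number of such strings, their total probability."""
    p1 = 1.0 - p0
    s = -(p0 * math.log(p0) + p1 * math.log(p1))
    count, prob = 0, 0.0
    for k in range(n + 1):                      # k = number of 0's in the string
        log_p = k * math.log(p0) + (n - k) * math.log(p1)
        if abs(-log_p / n - s) < eps:           # condition (4.15.28)
            m = math.comb(n, k)                 # strings with exactly k zeros
            count += m
            prob += math.exp(math.log(m) + log_p)
    return s, count, prob

N, EPS = 400, 0.05
S, COUNT, PROB = typical_class(0.3, N, EPS)
print(f"S = {S:.4f}")
print(f"log(count)/N = {math.log(COUNT) / N:.4f}")
print(f"weight of the remaining strings = {1 - PROB:.4f}")
```

In this example the logarithm of the number of selected strings, divided by $N$, falls between $S - \varepsilon$ and $S + \varepsilon$, while the strings left out carry a total weight smaller than $\varepsilon$.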


The following proposition (Hopf-Anosov-Sinai theorem, see [4]) gives an 
example of an analytic Hamiltonian system which is not analytically inte- 
grable. 


38 Proposition. Let X C R? be an analytic surface, bounded and with neg- 
ative curvature. The geodesic motion on X (i.e., the motion of a unit mass 
ideally constrained to X) is not analytically integrable because for every ana- 
lytically regular partition E of its phase space there exists a dense set of data 
whose (E,t) history, t = (jti)jez, is mixing and also has positive entropy. 


These last two theorems are two important examples of “ergodic theory” 
problems. This is a young theory; nevertheless, it is already rich in interesting 
results and, even more, interesting open problems. 


4.15.1 Exercises and Problems 


Can one build sequences of preassigned distribution? See Problems 1-12 below. 


1. Find examples of sequences $a$ of symbols $a_i = \pm 1$ without well-defined frequencies. (Hint:
For instance 10 symbols 1 followed by 102" symbols —1, followed by 102” symbols +1, etc.) 


2. Consider the sequence of symbols $a_i = \pm 1$:


$a = (1, -1, 1, 1, -1, -1, 1, 1, 1, -1, -1, -1, \ldots)$.
Show that it has well-defined frequencies and that $p\big({0 \atop 1}\big|\,a\big) = \frac12$, $p\big({0 \atop -1}\big|\,a\big) = \frac12$.


3. Show that the sequence in Problem 2 is non ergodic (Hint: Show that Eq. (4.14.35) is 
false for jı = 0,41 = j,a1 = 81 = 0.) 


4. Find an example of a subset $A \subset T^1$ such that, setting $E_0 = A$, $E_1 = T^1/A$, there is
in $T^1$ a point $\varphi$ whose history on the partition $\mathcal{E} = (E_0, E_1)$ with respect to the rotation
$\varphi \to \varphi + \omega \bmod 2\pi$, supposed irrational, does not have well-defined frequencies. (Hint: Let
$a$ be a sequence of 0's and 1's without well-defined frequencies, see Problem 1; then, given
$\varphi$, let $A = \bigcup_{k:\, a_k = 0} (\varphi + k\omega)$.)


5. Using Proposition 28, §4.13, p.339, and the method of proof of Proposition 30, §4.14, 
p.342, show that if E C T® is Riemann measurable, then every point of 7! evolving under 
an irrational rotation transformation visits $E$ with well-defined frequencies.


6. Let E = {0,1}, po = 4, pı = $, and consider the probability distributions (€, p) and 
(€,p), see Definition 20, §2.23, p.118. Let An (0) C EN be the sequences ag,...,aN_—1, 


1 
aj; = 0,1, in which the symbol 0 appears with frequency closer to 4 than N78, i.e. 


anO = fanan E(% 0 a)) a ne 
= 


Show that the probability of Ay (0) in (£, p)” is such that p(Ay(0)) > 1 — —. (Hint: 
8N 


4 
Use Chebyséev’s inequality, Proposition 34, p.119; see, also, Proposition 33, p.119.) 


7. In the context of Problem 6, regard Ay (0) as a subset Ay (0) of the space of the infinite 
sequences a = (a9, a1,...) of 0’s and 1’s defined by a € An (0)—> (ao,-..,an-1) E An (0). 
Show that the sets A,2(0) have the finite intersection property, i.e., 7_, A,2(0) 40,Vq > 1 
(Hint: Use Problem (6) to note that if ‘Ai, Ao, Aa, sacs Anz are all regarded as subsets in 
EF?” in a natural way, they have a probability in Ek, p(A;,2(0)) > 1- =e Hence, the 


complement of the intersection of any number of the A;,2’s has a probability such that 


p((NAx2(0))°) < S> p(Aga(0)°) < > Beas. 
k=0 


since (NEq)° C UES, in general. Hence, MA;,2(0) cannot be empty.) 


8. Extend Problem 6 to show that for every given string (o1,...,@s) or 0’s and 1’s, the 
set An(o1,...,0s) C (E,p)™ consisting of the strings a = (ao,...,an—1) € EN in which 
the string (o1,,...,@s) appears somewhere, with a frequency differing from 2~* by at most 


1 
N` 3 is such that 


E 
p(An(01,...,08)) > 1- 4 
N4 
for some €s. (Hint: Proceed as in Problem 6, observing that Ayn(01,...,0s) is the set 


fao, ana |A ( DX- 01)%(aj = 02)? (ajta = 0)?) = E < ar D) 


9. Extend Problem 7 as follows: regard Ay(01,...,0s) as a subset Ay(o1,.-.,¢8) in 
the space of the infinite sequences a of 0’s and 1’s defined by a € An(a1,...,05)<> 
(ao,.--@n—1) E An(o1,...,0s). Show that JNs, s = 1,2,..., such that Y n,q > 1: 

def 


© 0,1 q 
Bn,q oa Mesi Net Nog..-0s AN 42 (oo Secs Os) # 0. 
(Hint: See the hint to Problem 7 to estimate the probability or the complement of B. One 
now finds the condition: X? D: Wher Zi) 


10. In the context of Problem 9, show that if $\bigcap_{n,q} B_{n,q} \ne \emptyset$ and $a \in \bigcap_{n,q} B_{n,q}$, then $a$ has
well-defined frequencies and:


(i) $p\big({0 \ldots N-1 \atop a_0 \ldots a_{N-1}}\big|\,a\big) = 2^{-N}$, $\forall N$, $\forall\, a_0 \ldots a_{N-1}$;
(ii) $a$ is ergodic and mixing;
(iii) $S_{\rm abs}(a) = \log 2$, $S(a) = \log 2$.

(Hint: For (ii), check the mixing directly; for (iii), apply, with patience, Definition 21, p.354.)

11. Show that Nn,qBn,q in Problem 9 is nonempty. (Hint: Enumerate, from 1 to oo, the sets 
Ay, +2(01---0s) and denote them as D1, D2,.... Then, by Problem 9, LD; #0, Ym. 
Let am € Nj-1D3- Since the sequences ag have only two possible entries at each site, there 
must exist a subsequence ag,, qi > +00, and a aoo such that ag, eventually coincides with 
Aco on any finite number of sites: aco € Nj D;.) 


12. Extend Problems 6-11 to the case $E = \{0, 1\}$, $p_0 > 0$, $p_1 > 0$, $p_0 + p_1 = 1$, $p_0 \ne \frac12$.
Show that there are sequences of 0's and 1's such that $S_{\rm abs}(a) = \log 2$, $S(a) = -p_0 \log p_0
- p_1 \log p_1 < S_{\rm abs}(a)$.
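
The constructions of Problems 1 and 2 are easy to explore on a computer. The following is a minimal Python sketch, not part of the original text: `balanced_blocks` generates the sequence of Problem 2, while `growing_blocks` generates blocks of alternating sign whose lengths 10, 100, 1000, ... are an illustrative choice in the spirit of the hint to Problem 1. The helper `freq_plus` (a hypothetical name) measures the frequency of the symbol $+1$ among the first $n$ entries.

```python
from itertools import islice

def balanced_blocks():
    """Sequence of Problem 2: k symbols +1, then k symbols -1, for k = 1, 2, ..."""
    k = 1
    while True:
        yield from [1] * k
        yield from [-1] * k
        k += 1

def growing_blocks(base=10):
    """Blocks of alternating sign with lengths base, base**2, base**3, ..."""
    sign, length = 1, base
    while True:
        yield from [sign] * length
        sign, length = -sign, length * base

def freq_plus(seq, n):
    """Frequency of the symbol +1 among the first n entries of seq."""
    return sum(1 for a in islice(seq, n) if a == 1) / n

for n in (110, 10100, 1001000):
    print("balanced", n, freq_plus(balanced_blocks(), n))
for n in (10, 110, 1110, 11110, 111110):
    print("growing ", n, freq_plus(growing_blocks(), n))
```

For the first sequence the measured frequency settles near $\frac12$; for the second it keeps oscillating between values close to 0.09 and values close to 0.91 at the ends of successive blocks, so no limit frequency exists.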


Other necessary integrability criteria emerge from the following series of 
problems together with other remarkable properties of integrable systems. 


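13. Let $A_1, \ldots, A_\ell$ be $\ell$ prime integrals for an $\ell$-degrees-of-freedom Hamiltonian system on
$W \subset \mathbf{R}^{2\ell}$, or $W \subset \mathbf{R}^\ell \times T^\ell$ or $W \subset \mathbf{R}^\ell \times (\mathbf{R}^{\ell_1} \times T^{\ell_2})$, $\ell_1 + \ell_2 = \ell$, open. Call $A(W)$ the set
of the values of $(A_1, \ldots, A_\ell)$ on $W$: $A(W) \subset \mathbf{R}^\ell$. Suppose that the equation $A(p, q) = a$ can
be inverted with nonzero Jacobian near $p_0, q_0, a_0$: $p = \alpha(a, q)$, so that $A(\alpha(a, q), q) \equiv a$.
Define the $\ell \times \ell$ matrices:


$$M_{ij} = \frac{\partial A_i}{\partial p_j}, \qquad N_{ij} = \frac{\partial A_i}{\partial q_j}, \qquad R_{ij} = \frac{\partial \alpha_i}{\partial a_j}, \qquad T_{ij} = \frac{\partial \alpha_i}{\partial q_j}.$$

Study the “Hamilton-Jacobi” equations:

$$A\Big(\frac{\partial S}{\partial q}, q\Big) = a, \qquad \text{i.e.} \qquad \frac{\partial S}{\partial q} = \alpha(a, q)$$


and find conditions “guaranteeing their solubility” near $q_0, a_0$. Check that conditions could
be $\{A_i, A_j\} = 0$, $\forall\, i, j = 1, \ldots, \ell$, i.e.,


$$\sum_{s=1}^{\ell} \Big( \frac{\partial A_i}{\partial p_s} \frac{\partial A_j}{\partial q_s} - \frac{\partial A_i}{\partial q_s} \frac{\partial A_j}{\partial p_s} \Big) = 0$$


(see, also, Definition 19, §3.12). (Hint: It is only needed that the differential form $\alpha \cdot dq$
be exact, i.e. $\frac{\partial \alpha_i}{\partial q_j} = \frac{\partial \alpha_j}{\partial q_i}$, or $T_{ij} = T_{ji}$. By the implicit function theorem and by the chain
differentiation rule, it follows from $A(\alpha(a, q), q) \equiv a$ that


$$\sum_{s=1}^{\ell} \frac{\partial A_i}{\partial p_s} \frac{\partial \alpha_s}{\partial a_j} = \delta_{ij} \qquad \text{and} \qquad \sum_{s=1}^{\ell} \frac{\partial A_i}{\partial p_s} \frac{\partial \alpha_s}{\partial q_j} + \frac{\partial A_i}{\partial q_j} = 0;$$


i.e., with the above notations, $MR = 1$ and $MT + N = 0$. So, since $T = -M^{-1}N$, the
integrability condition becomes


$$M^{-1}N = (M^{-1}N)^{\mathsf{T}} = N^{\mathsf{T}} (M^{\mathsf{T}})^{-1} \iff N M^{\mathsf{T}} = M N^{\mathsf{T}}$$


because the Jacobian determinant $\det M \ne 0$. The last expression, once written explicitly,
yields the result (“Liouville's theorem”).)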


14. Show the following properties of the Poisson bracket, see Definition 19, §3.12:
$\{F, G\} = -\{G, F\}$,
$\{F, GL\} = \{F, G\}L + \{F, L\}G$,
$\{F, \{G, L\}\} + \{G, \{L, F\}\} + \{L, \{F, G\}\} = 0$.
Two observables $F, G$ on phase space are said to be “in involution” if $\{F, G\} = 0$.


15. In the context of Problems 13 and 14, suppose that $A_1, \ldots, A_\ell$ are $\ell$ prime integrals in
involution. Consider the completely canonical transformation $C$ generated by the function
$(a, q) \to S(a, q)$ in Problem 13 (via $\kappa = \frac{\partial S}{\partial a}$, $p = \frac{\partial S}{\partial q}$). Denote it $(a, \kappa) = C(p, q)$.
Show that $H(C^{-1}(a, \kappa)) = h(a)$ is $\kappa$ independent. (Hint: Since the $A$'s are prime integrals
($A = a(p, q)$) and the map $(p, q) \to (a, \kappa)$ is completely canonical, it must be that


$$\dot a = -\frac{\partial H(C^{-1}(a, \kappa))}{\partial \kappa} = 0,$$


i.e., $H(C^{-1}(a, \kappa))$ is $\kappa$ independent.)


16. Using the fact that the completely canonical transformations preserve the Poisson 
brackets, see Observation (2), p.237, to Corollary 25, §3.12, show that a necessary condition 
for the canonical integrability of a Hamiltonian system on a region W of phase space is the 
existence in $W$ of $\ell$ independent prime integrals in involution.


17. Show that a necessary and sufficient condition in order that $A \in C^\infty(W)$ be a prime
integral for a regular Hamiltonian system on $W$ is that $\{A, H\} = 0$, if $H$ is the Hamiltonian
function. More generally, if $S_t(p, q)$, $t \in J$, denotes a solution to the Hamilton equations in
$W$ and $F \in C^\infty(W)$, show that


$$\frac{d}{dt} F(S_t(p, q)) = \{H, F\}(S_t(p, q)), \qquad \forall t \in J.$$


Here $W \subset \mathbf{R}^{2\ell}$ or $\mathbf{R}^\ell \times T^\ell$ or $\mathbf{R}^\ell \times (\mathbf{R}^{\ell_1} \times T^{\ell_2})$, $\ell_1 + \ell_2 = \ell$, is open. (Hint: Just compute
the derivative of $F$ using the Hamilton equations, to express $\dot p, \dot q$, and the definition of the
Poisson bracket.)


18. Let $W$ be as in the above problems and let $H \in C^\infty(W)$ be a regular Hamiltonian
function. Assume that $H$ is integrable on $W$ and let $I$ be the integrating transformation
$I : W \to V \times T^\ell$, $V \subset \mathbf{R}^\ell$; let $(A, \varphi) = I(p, q)$ and denote $\omega(A)$ the pulsations of the
quasi-periodic motions on the torus $\{A\} \times T^\ell$. We say that the system is “non isochronous”
in $W$ if the matrix $J_{ij}(A) = \frac{\partial \omega_i}{\partial A_j}(A)$ has a non vanishing determinant.

Show that any prime integral $B \in C^\infty(W)$ for a non isochronous integrable Hamiltonian
system must be a function of $A_1, \ldots, A_\ell$ introduced above. (Hint: Let $B = b(A, \varphi)$ be a
prime integral in the $(A, \varphi)$ variables. It must be $b(A, \varphi) = b(A, \varphi + \omega(A)t)$, $\forall t \in \mathbf{R}$. If the
components of $\omega(A)$ are rationally independent the points $\varphi + \omega(A)t$, $t \in \mathbf{R}$, densely cover
$T^\ell$; hence, for such $A$'s, $B$ must depend only on $A$ and not on $\varphi$. However, if $\det J \ne 0$,
the set of $A$'s in $V$ such that $\omega(A)$ has rationally independent coordinates is dense in $V$
(see Problems 9 and 15, §5.10, p.477 and 478). Hence, $B$ must always depend only on $A$.)


19. There is a theorem by Arnold concerning the case when $W$ is an invariant open bounded
set for a regular Hamiltonian flow generated by $H \in C^\infty(W)$ and on $W$ one can define $\ell$
independent prime integrals $A = (A_1, \ldots, A_\ell)$ in involution (see Problem 14), with $A_1 = H$
and such that the sets $A(p, q) = a$ are, for $a \in A(W)$, regular closed bounded and
connected surfaces in $W$. Then $H$ is integrable on $W$.

Are there systems integrable but not canonically integrable? (Answer: If $\ell = 2$, some partial
results are known, [43]; another partial answer is in Problem 22 below. The proof of
Arnold's theorem can be found on page 269 of [1].)


20. Find an example of a Hamiltonian system whose motions are all quasi-periodic but
which is not integrable. (Hint: consider two point masses free on a circle and on a line,
respectively; let their positions be determined by $(\varphi_1, q_2) \in T^1 \times \mathbf{R}$ if $\varphi_1$ is the angular
position of the first particle and $q_2$ the position of the second. Let $H(p_1, p_2, \varphi_1, q_2) = \frac{p_1^2}{2}$.)


21. Suppose that in the region $W$ an analytic Hamiltonian $H(p, q)$ admits $\ell$ independent
prime integrals $A = (A_1, \ldots, A_\ell)$ and $H = A_1$. Suppose that the surfaces $A = a$ are
tori of dimension $\ell$. Write their parametric equations as $p = P(a, \varphi)$, $q = Q(a, \varphi)$ and suppose
that the evolution is $\varphi \to \varphi + \omega(A)t$: i.e. suppose that all motions are quasi periodic.
If $\det \frac{\partial \omega}{\partial A} \ne 0$, i.e. if the system is anisochronous, then $\{A_i, A_j\} = 0$, $\forall i, j = 1, \ldots, \ell$.
(Hint: Suppose that $\{A_i, A_j\} \ne 0$; then evolve an initial datum $(a, \varphi)$ with the Hamilton
equations with Hamiltonian $A_i$ for a small time $\varepsilon$ and then with the Hamiltonian $A_1 = H$
for a long time $t$. Since $A_i$ and $H$ have zero Poisson bracket the two evolutions “commute”
and the final datum has to be the same as the one obtained by first evolving $(a, \varphi)$ for a
time $t$ with the Hamilton equations for $H$ and then for a time $\varepsilon$ with $A_i$. In the first case
the result will be a datum $(a', \varphi' + \omega' t)$ with $a', \varphi', \omega'$ close within $O(\varepsilon)$ to $(a, \varphi, \omega)$; in the second
case the result will be $(a'', \varphi'' + \omega t)$ with $a'', \varphi''$ close to $a, \varphi$ within $O(\varepsilon)$. However, since
$\{A_i, A_j\} \ne 0$, it is $\omega' \ne \omega$ and this is a contradiction for $t$ large.)


22. Check that combining Problems 21 and 19 above a new criterion of completely canonical 
integrability follows. 


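23. Let $k \to f(k)$ be a function defined for $k = 1, 2, \ldots$ such that $0 \le f(k)$, $f(k + h) \le
f(k) + f(h)$, for all $h, k = 1, 2, \ldots$. Show that


$$\lim_{k \to +\infty} \frac{f(k)}{k} = \inf_k \frac{f(k)}{k}$$


and apply this result to prove the existence of the limit (4.15.2) by showing that $f(k) =
\log N_{\rm abs}(k, a)$ has the above subadditivity properties. (Hint: Let $s = \inf_k k^{-1} f(k)$, let $\varepsilon > 0$ and let $k_\varepsilon$ be such
that $s \le k_\varepsilon^{-1} f(k_\varepsilon) < s + \varepsilon$; write $k = h k_\varepsilon + p$ with $h = 0, 1, \ldots$ and $p = 0, 1, \ldots, k_\varepsilon - 1$ and
note that $s \le k^{-1} f(k) \le (h k_\varepsilon + p)^{-1}(h f(k_\varepsilon) + f(p)) \xrightarrow[k \to \infty]{} k_\varepsilon^{-1} f(k_\varepsilon) < s + \varepsilon$.)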
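
The statement of Problem 23 can be illustrated on a concrete subadditive function. The following is a minimal Python sketch, not part of the original text. It assumes, purely for illustration, that the count playing the role of $N_{\rm abs}(k, a)$ is the number of strings of 0's and 1's of length $k$ with no two consecutive 1's; the names `count_words` and `f` are hypothetical. These counts are Fibonacci numbers, so the limit of $f(k)/k$ is the logarithm of the golden ratio.

```python
import math

def count_words(k):
    """Number of 0/1 strings of length k containing no two consecutive 1's."""
    a, b = 1, 2                     # counts for lengths 0 and 1
    for _ in range(k):
        a, b = b, a + b
    return a

def f(k):
    """Logarithm of the count: the subadditive function of the example."""
    return math.log(count_words(k))

# subadditivity f(k + h) <= f(k) + f(h), checked exactly on the integer counts
SUBADDITIVE = all(count_words(k + h) <= count_words(k) * count_words(h)
                  for k in range(1, 40) for h in range(1, 40))
LIMIT = math.log((1 + math.sqrt(5)) / 2)   # growth rate of the Fibonacci numbers

for k in (1, 10, 100, 1000):
    print(k, f(k) / k)
print("subadditive:", SUBADDITIVE, " expected limit:", LIMIT)
```

The printed ratios decrease toward the expected limit from above, in agreement with the identification of the limit with the infimum.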
Stability Properties for Dissipative and
Conservative Systems


5.1 A Mathematical Model for the Illustration of Some 
Properties of Dissipative Systems 


In various possible senses, the stability properties of motions are more easily 
analyzed in systems moving in the presence of friction, as already noted in 
Chapter 2. 

Therefore, we shall mainly concentrate our attention on such systems, 
studying some stability questions selected among others because they seem 
particularly significant for the generality of the methods used to treat them. 

Similar questions will later be asked about conservative systems. However, 
the answers, when known, will be much harder to obtain. 

The gyroscope is, in some sense, the prototype for systems with many
degrees of freedom. In fact, general systems of linear oscillators trivially reduce
to systems of independent one dimensional oscillators, as explained in §4.1-4.4 for
the conservative case; this remains true even in the presence of linear friction.

On the other hand, the gyroscope with friction, or even some of its partic- 
ular cases, already presents many of the possibilities and difficulties that can 
be met in more complex systems. 

For this reason, in the upcoming sections, we shall illustrate the general 
theory through the treatment of a single example, described below and drawn 
from the gyroscope theory, which will be used to motivate the successive 
steps of a theory and of a method of analysis which, as will become evident, 
is applicable to many other dissipative systems as well. 

The example is given by Eqs. (5.1.18) and (5.1.19) and this section is
devoted to their gyroscopic interpretation.

We consider a rigid body consisting of $N$ masses, $m_1, m_2, \ldots, m_N > 0$, with
a fixed point $O$ (all the constraints being ideal) immersed in a viscous fluid
opposing to the motion a frictional force at the $i$-th point:


$$-\lambda m_i \dot{x}_i \qquad (5.1.1)$$


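The moment of the frictional force with respect to $O$ is then given by


$$-\lambda \sum_{i=1}^{N} m_i (P_i - O) \wedge \dot{x}_i = -\lambda \sum_{i=1}^{N} m_i (P_i - O) \wedge (\omega \wedge (P_i - O)) = -\lambda I \omega \qquad (5.1.2)$$

with the notations of §4.11.
The second cardinal equation¹ implies


$$I \dot\omega = -\lambda I \omega - \omega \wedge I \omega, \qquad (5.1.3)$$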


where $\dot\omega$ is the vector whose components in a co-moving frame $(O; \mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3)$
are the derivatives of the corresponding components of $\omega$ in the same frame.
Equation (5.1.3) extends Eq. (4.11.31) to the case when the moment of the
external forces is $-\lambda I \omega$ instead of 0.

Assume that the co-moving frame has been fixed once and for all so that
the inertia matrix $I$ is diagonal, see Eq. (4.11.9), p.308, with elements


$$I_1, I_2, I_3. \qquad (5.1.4)$$


In order to obtain nontrivial motions, it will be convenient to imagine that 
the system is subject to the action of other forces having a moment M with 
respect to O. Otherwise, as is intuitively clear and as we shall shortly see, 
the system will just stop. The simplest force laws are those with moment M 
having constant components on the axes of $(O; \mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3)$:


$$M = R_1 \mathbf{i}_1 + R_2 \mathbf{i}_2 + R_3 \mathbf{i}_3 \qquad (5.1.5)$$

or those with moment components in $(O; \mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3)$ dependent only upon the
angular velocity

$$M'(\omega) = R'_1(\omega) \mathbf{i}_1 + R'_2(\omega) \mathbf{i}_2 + R'_3(\omega) \mathbf{i}_3 \qquad (5.1.6)$$

which can be imagined (as in the examples below) generated by some “inner
mechanisms” regulating their action as a function of the motion of the body.

In the presence of forces with moments of Eqs. (5.1.5) and (5.1.6) added
to the friction forces, the equations of motion of the system would become

$$I \dot\omega = -\omega \wedge I \omega - \lambda I \omega + M + M'(\omega). \qquad (5.1.7)$$

Even in the simplest situations, e.g. if


$$M = R\, \mathbf{i}_3, \qquad M'(\omega) = \text{linear function of } \omega \qquad (5.1.8)$$


it could a priori happen that the differential equation Eq. (5.1.7) admits solutions
$t \to S_t(\omega)$, with suitable initial datum $\omega$, diverging as $t \to +\infty$.²

¹ $\dot{K}_O = -\lambda I \omega$.

We wish to avoid having to deal with such phenomena, too idealized from 
a physical point of view, since it is clear that any real system “breaks down 
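into pieces” if $\omega$ reaches too large a value, when the centrifugal forces exceed
the material's resistance. This is done by supposing that the friction coefficient
has some extra dependence on $\omega$. For instance,


$$\lambda(\omega) = (\lambda_1 + \lambda_2 \omega^2), \qquad \lambda(\omega) = (\lambda_1 + \lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2 + \lambda_2''' \omega_3^2) \qquad (5.1.9)$$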


which is a special case of the more general and realistic friction model in which
Eq. (5.1.1) is replaced by $-\lambda \mu_i\, (1 + (\dot{x}_i \cdot L_i \dot{x}_i))\, \dot{x}_i$, where $\lambda, \mu_i > 0$ and the $L_i$ are
$3 \times 3$ positive-definite matrices.

Summarizing the above discussion, the mechanical system whose proper- 
ties we wish to analyze will be described by the equation 


$$I \dot\omega = -\omega \wedge I \omega - \lambda(\omega) I \omega + M + M'(\omega), \qquad (5.1.10)$$


where $\lambda(\omega)$ is given by Eq. (5.1.9) and $M'(\omega)$ is a linear function of $\omega$.

The above system is general enough to present a great variety of phenomena.
For simplicity, we shall impose further restrictions, studying the following
particular case of Eq. (5.1.10).


(i) The rigid body is a gyroscope: $I_1 = I_2 = I$, $I_3 = J$. (5.1.11)
(ii) $M = R\, \mathbf{i}_3$, $R > 0$. (5.1.12)
(iii) $M'(\omega) = \alpha_1 \omega_1 \mathbf{i}_1 + \alpha_2 \omega_2 \mathbf{i}_2$, $\alpha_1 = \alpha_2 = \alpha > 0$. (5.1.13)


(iv) $\lambda(\omega)$ is given by the first or second of Eqs. (5.1.9), (5.1.13).


It might be useful to have in mind a physical representation of the special
system mathematically described by Eqs. (5.1.9)-(5.1.13): think of the body
as consisting of six masses $m$ located at the points $\pm\rho\, \mathbf{i}_1$, $\pm\rho\, \mathbf{i}_2$, $\pm\rho'\, \mathbf{i}_3$. Then


$$I_1 = I_2 = I = 2m(\rho^2 + \rho'^2), \qquad J = I_3 = 4m\rho^2. \qquad (5.1.14)$$


The force given by Eq. (5.1.12) can be imagined to be generated by small
“jet motors” located at the four points $\pm\rho\, \mathbf{i}_1$, $\pm\rho\, \mathbf{i}_2$ of the $x_3 = 0$ plane, producing
a thrust $f$ identical at each site and perpendicular to the coordinate
axis on which the site lies and parallel to the $x_3 = 0$ plane. The moment of
such forces is


$$M = 4\rho f\, \mathbf{i}_3 \qquad (5.1.15)$$


like Eq. (5.1.12) with $R = 4\rho f$. The other force given by Eq. (5.1.13) is
generated by small jet motors located at the two points on the axis $\mathbf{i}_3$, exerting
a thrust along $\mathbf{i}_1$ and $\mathbf{i}_2$, respectively, with intensities


$$f' \omega_2\, \mathbf{i}_1 \quad \text{and} \quad f' \omega_1\, \mathbf{i}_2 \qquad (5.1.16)$$


and, therefore, their moment is


$$f' \rho' (\omega_1 \mathbf{i}_1 + \omega_2 \mathbf{i}_2) \qquad (5.1.17)$$


like Eq. (5.1.13) with $\alpha = f' \rho'$.

² However, in this case, the global existence of the solutions, assuming $\lambda$ constant and
$M'$ linear in $\omega_1, \omega_2, \omega_3$, follows from an a priori estimate, for $t \in \mathbf{R}_+$. Let $\Omega = I\omega$ and
multiply both sides of Eq. (5.1.7) scalarly by $\Omega$. It follows: $\frac12 \frac{d}{dt} \Omega^2 \le K \Omega^2 + K'$ for some
$K, K' > 0$, which implies $(K \Omega(t)^2 + K') \le (K \Omega(0)^2 + K')\, e^{2Kt}$, $\forall t \ge 0$.

The somewhat bizarre force given by Eq. (5.1.16) must be thought of as
generated by jets producing a thrust proportional to the amount of air entering
them per unit time, supposing them to be oriented as $\mathbf{i}_2$ and $\mathbf{i}_1$, respectively,
and orthogonal to $\mathbf{i}_3$. The amount of air entering the jets per unit time is in
this way proportional to $\omega_2$ and $\omega_1$, respectively.

Obviously, if $f' \ne 0$, the gyroscope will tend to increase its rotation speed
around the axes $\mathbf{i}_1, \mathbf{i}_2$, but not indefinitely: only until the system reaches
a rotation speed causing so strong a friction as to compensate for the force of
the motor (this is what actually happens if $\lambda_2', \lambda_2'', \lambda_2''' > 0$).

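Explicitly writing Eq. (5.1.10) by components, given the assumptions of
Eqs. (5.1.9)-(5.1.13), it is


$$\begin{aligned}
\dot\omega_1 &= -(\lambda_1 + \lambda_2 \omega^2)\, \omega_1 + \alpha \omega_1 - \omega_2 \omega_3 \\
\dot\omega_2 &= -(\lambda_1 + \lambda_2 \omega^2)\, \omega_2 + \alpha \omega_2 + \omega_1 \omega_3 \qquad (5.1.18) \\
\dot\omega_3 &= -(\lambda_1 + \lambda_2 \omega^2)\, \omega_3 + R,
\end{aligned}$$


with $R, \alpha, \lambda_1, \lambda_2 > 0$, if the first of Eqs. (5.1.9) is assumed and if $(J - I)/I = 1$,
a case to which one can reduce by a change of variables rescaling the $\omega_i$; $\alpha$, $R$
are real numbers supposed positive, for definiteness.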

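If the second of Eqs. (5.1.9) is assumed, then


$$\begin{aligned}
\dot\omega_1 &= -(\lambda_1 + \lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2 + \lambda_2''' \omega_3^2)\, \omega_1 + \alpha \omega_1 - \omega_2 \omega_3 \\
\dot\omega_2 &= -(\lambda_1 + \lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2 + \lambda_2''' \omega_3^2)\, \omega_2 + \alpha \omega_2 + \omega_1 \omega_3 \qquad (5.1.19) \\
\dot\omega_3 &= -(\lambda_1 + \lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2 + \lambda_2''' \omega_3^2)\, \omega_3 + R,
\end{aligned}$$


with $R, \alpha, \lambda_1, \lambda_2', \lambda_2'', \lambda_2''' > 0$. Define $\lambda_2 = \min(\lambda_2', \lambda_2'', \lambda_2''') > 0$.

A symmetry in Eq. (5.1.18), absent in Eq. (5.1.19), leads to the elimination
of one of the variables. In fact, if $w^2 \stackrel{\rm def}{=} \omega_1^2 + \omega_2^2$, so that $\omega^2 = w^2 + \omega_3^2$, we find, by multiplying the
first of Eqs. (5.1.18) by $\omega_1$ and the second by $\omega_2$ and adding them:


$$\begin{aligned}
\frac12 \frac{d w^2}{dt} &= -(\lambda_1 + \lambda_2 w^2 + \lambda_2 \omega_3^2)\, w^2 + \alpha w^2, \\
\frac{d \omega_3}{dt} &= -(\lambda_1 + \lambda_2 \omega^2)\, \omega_3 + R,
\end{aligned} \qquad (5.1.20)$$


with $\lambda_1, \lambda_2, R, \alpha > 0$: much simpler as it involves only two unknowns, $w^2$, $\omega_3$.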
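
The reduction from three unknowns to two can be checked numerically. The following is a minimal Python sketch, not part of the original text: it integrates Eq. (5.1.18) and the reduced system (5.1.20) with a classical fourth-order Runge-Kutta scheme and compares $\omega_1^2 + \omega_2^2$ and $\omega_3$ at the final time. The parameter values, the initial datum and the names `rk4`, `full`, `reduced` are illustrative assumptions.

```python
def rk4(f, x, dt, steps):
    """Classical fourth-order Runge-Kutta integration of dx/dt = f(x)."""
    for _ in range(steps):
        k1 = f(x)
        k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)])
        k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)])
        k4 = f([xi + dt * ki for xi, ki in zip(x, k3)])
        x = [xi + dt * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

LAM1, LAM2, ALPHA, R = 1.0, 0.5, 2.0, 1.0      # illustrative values

def full(w):
    """Right-hand side of Eq. (5.1.18)."""
    w1, w2, w3 = w
    lam = LAM1 + LAM2 * (w1 * w1 + w2 * w2 + w3 * w3)
    return [-lam * w1 + ALPHA * w1 - w2 * w3,
            -lam * w2 + ALPHA * w2 + w1 * w3,
            -lam * w3 + R]

def reduced(y):
    """Right-hand side of Eq. (5.1.20) for y = (w^2, omega_3)."""
    u, w3 = y
    lam = LAM1 + LAM2 * (u + w3 * w3)
    return [2.0 * u * (ALPHA - lam), -lam * w3 + R]

W0 = [0.3, -0.2, 0.1]
WT = rk4(full, W0, 1e-3, 5000)                               # up to t = 5
YT = rk4(reduced, [W0[0] ** 2 + W0[1] ** 2, W0[2]], 1e-3, 5000)
print(WT[0] ** 2 + WT[1] ** 2, YT[0])
print(WT[2], YT[1])
```

The two printed pairs agree up to the integration error, as expected, since the terms $\mp\omega_1\omega_2\omega_3$ cancel when the first two equations are combined.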


5.2 Stationary Motions for a Dissipative Gyroscope 


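We first remark that Eqs. (5.1.18) and (5.1.19) admit global solutions in the future.


1 Proposition. Equation (5.1.19) admits a solution $t \to S_t(\omega_0)$, $t \in \mathbf{R}_+$, for
every initial datum $\omega_0 \in \mathbf{R}^3$.
Furthermore, if $\lambda_2 = \min(\lambda_2', \lambda_2'', \lambda_2''')$ and $\Omega = \big(\frac{2R}{\lambda_2}\big)^{1/3} + \big(\frac{2|\alpha - \lambda_1|}{\lambda_2}\big)^{1/2}$:


(i) $|S_t(\omega_0)| \le |\omega_0| + \Omega$, $\forall t \ge 0$ (5.2.1)

(ii) $|S_t(\omega_0)| \le 2\Omega$, $\forall t \ge \frac{|\omega_0|^2 - 4\Omega^2}{2 \lambda_2 \Omega^4}$ (5.2.2)

Observations.

(1) Equation (5.2.1) means that the trajectories of the motions of the $\omega$'s are
bounded uniformly for $t \ge 0$.

(2) Equation (5.2.2) means that all motions take place inside the ball with
radius $2\Omega$ after a finite transient time (which may depend upon the initial
datum).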


PROOF. To show global existence, it suffices to show, on the basis of Definition
3 and Proposition 5, §2.5, p.28, an a priori estimate, i.e., it suffices to show
that if $t \to S_t(\omega_0)$ is a solution to Eq. (5.1.19) for $t \in [0, T]$ with datum $\omega_0$,
then it verifies the inequality (5.2.1), $\forall t \in [0, T]$.
This is a simple consequence of the structure of Eq. (5.1.19). In fact, let
$\omega = S_t(\omega_0)$ and multiply the equations by $\omega_1, \omega_2, \omega_3$, respectively; adding the
results yields


$$\frac{d}{dt} \frac12 \omega^2 = -\lambda(\omega)\, \omega^2 + \alpha (\omega_1^2 + \omega_2^2) + R \omega_3 \le |\alpha - \lambda_1|\, \omega^2 - \lambda_2 \omega^4 + R |\omega_3|. \qquad (5.2.3)$$


Assuming the inequalities


$$\frac{\lambda_2}{2} |\omega|^4 > |\alpha - \lambda_1|\, |\omega|^2, \qquad \frac{\lambda_2}{2} |\omega|^4 > R |\omega| \qquad (5.2.4)$$


the right-hand side of Eq. (5.2.3) is negative. Hence, if initially $|\omega_0| > \Omega$ with


$$\Omega \stackrel{\rm def}{=} \Big(\frac{2R}{\lambda_2}\Big)^{1/3} + \Big(\frac{2|\alpha - \lambda_1|}{\lambda_2}\Big)^{1/2} \qquad (5.2.5)$$


the quantity $|S_t(\omega_0)| = |\omega|$ must decrease as $t$ grows, at least until it becomes
$\le \Omega$. This implies both global existence and the estimate (5.2.1).

To find the estimate (5.2.2), note that $|\omega| \ge 2\Omega$ implies that the right hand
side of Eq. (5.2.3) is smaller than $-\lambda_2 \Omega^4$. Hence, as long as $|S_t(\omega_0)| \ge 2\Omega$,
one must have


$$|S_t(\omega_0)|^2 \le |\omega_0|^2 - 2 \lambda_2 \Omega^4 t \qquad (5.2.6)$$

which means that for $t \ge \frac{|\omega_0|^2 - 4\Omega^2}{2 \lambda_2 \Omega^4}$ it will be $|S_t(\omega_0)| \le 2\Omega$. □
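
The two estimates of Proposition 1 can be tested on a numerically computed trajectory. The following is a minimal Python sketch, not part of the original text: it integrates Eq. (5.1.19) from a large initial datum with a fourth-order Runge-Kutta step and monitors $|\omega(t)|$. The parameter values, the initial datum and the helper names are illustrative assumptions; `T_STAR` is the transient time $(|\omega_0|^2 - 4\Omega^2)/(2\lambda_2\Omega^4)$ obtained from Eq. (5.2.6).

```python
import math

LAM1, LAM2P, ALPHA, R = 1.0, (0.5, 0.8, 0.6), 2.0, 1.0   # illustrative values

def rhs(w):
    """Right-hand side of Eq. (5.1.19)."""
    w1, w2, w3 = w
    lam = LAM1 + LAM2P[0] * w1 ** 2 + LAM2P[1] * w2 ** 2 + LAM2P[2] * w3 ** 2
    return [-lam * w1 + ALPHA * w1 - w2 * w3,
            -lam * w2 + ALPHA * w2 + w1 * w3,
            -lam * w3 + R]

def rk4_step(w, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(w)
    k2 = rhs([x + 0.5 * dt * k for x, k in zip(w, k1)])
    k3 = rhs([x + 0.5 * dt * k for x, k in zip(w, k2)])
    k4 = rhs([x + dt * k for x, k in zip(w, k3)])
    return [x + dt * (a + 2 * b + 2 * c + d) / 6
            for x, a, b, c, d in zip(w, k1, k2, k3, k4)]

def norm(w):
    return math.sqrt(sum(x * x for x in w))

LAM2 = min(LAM2P)
OMEGA = (2 * R / LAM2) ** (1 / 3) + (2 * abs(ALPHA - LAM1) / LAM2) ** 0.5
W0 = [6.0, -5.0, 7.0]
T_STAR = (norm(W0) ** 2 - 4 * OMEGA ** 2) / (2 * LAM2 * OMEGA ** 4)

w, dt, max_all, max_late = W0, 1e-4, norm(W0), 0.0
for n in range(1, 20001):                 # integrate up to t = 2
    w = rk4_step(w, dt)
    max_all = max(max_all, norm(w))
    if n * dt > T_STAR:
        max_late = max(max_late, norm(w))
print(f"Omega = {OMEGA:.4f}, T* = {T_STAR:.4f}")
print(f"max |w| = {max_all:.4f}, bound (5.2.1) = {norm(W0) + OMEGA:.4f}")
print(f"max |w| after T* = {max_late:.4f}, bound (5.2.2) = {2 * OMEGA:.4f}")
```

In this run both bounds hold with a wide margin, which is consistent with the fact that the a priori estimates are far from optimal.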


In general, the simplest information about the nature of the motions de- 
scribed by a differential equation can be obtained through the study of sta- 
tionary solutions. 


2 Proposition. Equation (5.1.19) has, $\forall R > 0$ and $\forall \alpha > 0$, a unique stationary
solution $\bar\omega$. This solution has $\bar\omega_1 = \bar\omega_2 = 0$, while $\bar\omega_3$ is the unique real
solution to the equation


$$-(\lambda_1 + \lambda_2''' \bar\omega_3^2)\, \bar\omega_3 + R = 0. \qquad (5.2.7)$$


PROOF. Setting $\dot\omega_1 = \dot\omega_2 = 0$ in the first two of Eqs. (5.1.19) and imagining³
known $\lambda(\bar\omega)$ and $\bar\omega_3$, one obtains two homogeneous linear equations for $\bar\omega_1, \bar\omega_2$
with determinant


$$(\alpha - \lambda(\bar\omega))^2 + \bar\omega_3^2 \qquad (5.2.8)$$


which vanishes only for $\bar\omega_3 = 0$ and $\alpha = \lambda(\bar\omega)$, but the third of Eqs. (5.1.19)
does not admit a stationary solution with $\bar\omega_3 = 0$. Hence, Eq. (5.2.8) does not
vanish and, therefore, $\bar\omega_1 = \bar\omega_2 = 0$, which in turn implies that $\bar\omega_3$ has to verify
Eq. (5.2.7). This equation admits just one solution by the strict monotonicity
in $\bar\omega_3$ of the left-hand side. □
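
Since the left-hand side of Eq. (5.2.7) is strictly decreasing, its root is easily located by bisection. The following is a minimal Python sketch, not part of the original text; the values of $\lambda_1$, $\lambda_2'''$, $R$ and the names `g`, `stationary_w3` are illustrative assumptions.

```python
LAM1, LAM2_3, R = 1.0, 0.6, 1.0      # illustrative lambda_1, lambda_2''', R

def g(x):
    """Left-hand side of Eq. (5.2.7)."""
    return -(LAM1 + LAM2_3 * x * x) * x + R

def stationary_w3(tol=1e-14):
    """Bisection on [0, R/lambda_1], where g changes sign: g(0) = R > 0 and
    g(R/lambda_1) = -lambda_2''' (R/lambda_1)^3 < 0."""
    lo, hi = 0.0, R / LAM1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

W3_BAR = stationary_w3()
print(W3_BAR, g(W3_BAR))
```

The bracketing interval also shows that $0 < \bar\omega_3 < R/\lambda_1$: the nonlinear friction lowers the stationary angular velocity below the value $R/\lambda_1$ that linear friction alone would give.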


A natural question is: how does the actual motion of the gyroscope look
if the angular velocity is $\bar\omega$?


3 Proposition. The motion of the gyroscope corresponding to the stationary
solution $\bar\omega$ of Eq. (5.1.19) is a rotation with constant angular velocity $\bar\omega_3$
around the axis $\mathbf{i}_3$, which remains fixed in space.


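PROOF. Let $t \to (\theta(t), \varphi(t), \psi(t))$ be a description of the motion in terms of
the Euler angles $(\theta, \varphi, \psi)$ of $(O; \mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3)$ with respect to the fixed reference
frame $(O; \mathbf{i}, \mathbf{j}, \mathbf{k})$.

From Eqs. (4.11.12), (4.11.13), and (4.11.14),⁴ p.309, one deduces the relationship
between $\dot\theta(t)$, $\dot\varphi(t)$, $\dot\psi(t)$, and the vector $\omega(t)$. In general,


$$\dot\theta = \omega_1 \cos\psi - \omega_2 \sin\psi, \qquad (5.2.9)$$

$$\dot\varphi = \frac{\omega_1 \sin\psi + \omega_2 \cos\psi}{\sin\theta}, \qquad (5.2.10)$$

$$\dot\psi = \omega_3 - \frac{\cos\theta}{\sin\theta} (\omega_1 \sin\psi + \omega_2 \cos\psi). \qquad (5.2.11)$$


³ $\lambda(\omega)$ denotes $\lambda_1 + \lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2 + \lambda_2''' \omega_3^2$.
⁴ Without the bars, since now there is no need of them.


Letting $\omega_1 = \omega_2 = 0$ and $\omega_3 = \bar\omega_3$, one deduces from Eq. (5.2.9) that $\theta$
is constant ($\dot\theta = 0$). Hence we may suppose, without loss of generality, that
$(O; \mathbf{i}, \mathbf{j}, \mathbf{k})$ has been fixed so that $\theta(0) \ne 0, \pi$.

The second equation, Eq. (5.2.10), implies $\dot\varphi = 0$. Hence, $\varphi$ is a constant.
Since $\theta$ and $\varphi$ determine the position of $\mathbf{i}_3$ in $(O; \mathbf{i}, \mathbf{j}, \mathbf{k})$, it follows that $\mathbf{i}_3$ is
fixed in $(O; \mathbf{i}, \mathbf{j}, \mathbf{k})$ and, therefore, the system rotates around $\mathbf{i}_3$ (fixed) with
angular velocity given by $\dot\psi = \bar\omega_3$, by Eq. (5.2.11). □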


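We can now begin the study of non stationary motions. If $\alpha < \lambda_1$ the
motions are particularly simple.


4 Proposition. If $\alpha < \lambda_1$ the solutions $t \to S_t(\omega)$ of Eq. (5.1.19) with initial
datum $\omega$ verify


$$|S_t(\omega) - \bar\omega| \le c(|\omega|)\, e^{-(\lambda_1 - \alpha)t}, \qquad (5.2.12)$$


where $c(x)$ is a suitable increasing function of $x \in \mathbf{R}_+$.

The corresponding motion of the gyroscope tends asymptotically to become a
uniform rotation with angular velocity $\bar\omega_3$ around the axis $\mathbf{i}_3$ which in turn
tends to acquire a fixed position in $(O; \mathbf{i}, \mathbf{j}, \mathbf{k})$, the fixed reference frame.
More precisely, if $t \to (\theta(t), \varphi(t), \psi(t))$ is the description of the motion in terms of
the Euler angles, whose angular velocity is $\omega(t) = S_t(\omega)$, for $t \ge 0$, there exist
constants $t_1 > 0$, $C_1 > 0$, $\bar\theta, \bar\varphi, \bar\psi$, depending on the initial data and such that


$$\begin{aligned}
|\theta(t) - \bar\theta| &\le C_1\, e^{-(\lambda_1 - \alpha)t} \\
|\varphi(t) - \bar\varphi| &\le \frac{C_1}{|\sin\bar\theta|}\, e^{-(\lambda_1 - \alpha)t} \qquad (5.2.13) \\
|\psi(t) - \bar\psi - \bar\omega_3 t| &\le \frac{C_1}{|\sin\bar\theta|}\, e^{-(\lambda_1 - \alpha)t}
\end{aligned}$$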


For instance, Cı can be chosen as Ci = fete) see Eq. (5.2.12). 


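PROOF. First check that Eq. (5.2.12) implies Eq. (5.2.13). In fact, Eqs. (5.2.9)
and (5.2.12) imply that $\dot\theta(t) \to 0$ exponentially as $t \to +\infty$. Hence, we can define


$$\bar\theta = \lim_{t \to +\infty} \theta(t) = \lim_{t \to +\infty} \Big(\theta(0) + \int_0^t \dot\theta(\tau)\, d\tau\Big) = \theta(0) + \int_0^{+\infty} \dot\theta(\tau)\, d\tau \qquad (5.2.14)$$

because the integral converges, see Eq. (5.2.13). Also,


$$|\theta(t) - \bar\theta| = \Big|\int_t^{+\infty} \dot\theta(\tau)\, d\tau\Big| \le 2\, c(\Omega + |\omega|)\, \frac{e^{-(\lambda_1 - \alpha)t}}{\lambda_1 - \alpha} \qquad (5.2.15)$$


by Eqs. (5.2.9) and (5.2.12).

Possibly by rotating the fixed frame, suppose that $\bar\theta \ne 0, \pi$. Then Eq.
(5.2.10) implies that $\dot\varphi$ tends to zero exponentially and, as above, it is


$$\bar\varphi = \varphi(0) + \int_0^{+\infty} \dot\varphi(\tau)\, d\tau, \qquad (5.2.16)$$


$$|\varphi(t) - \bar\varphi| \le \frac{2\, c(\Omega + |\omega|)}{\inf_{\tau \ge t_1} |\sin\theta(\tau)|}\, \frac{e^{-(\lambda_1 - \alpha)t}}{\lambda_1 - \alpha} \qquad (5.2.17)$$

which show the second of Eqs. (5.2.13).


Similarly Eqs. (5.2.12) and (5.2.11) imply that $\dot\psi$ approaches $\bar\omega_3$ exponentially,
as $t \to +\infty$. Hence, setting


$$\bar\psi = \psi(0) + \int_0^{+\infty} (\dot\psi(\tau) - \bar\omega_3)\, d\tau, \qquad (5.2.18)$$


one finds, by Eqs. (5.2.11) and (5.2.12), for $t \ge t_1$


$$\begin{aligned}
|\psi(t) - \bar\psi - \bar\omega_3 t| &= \Big|\psi(0) + \int_0^t \dot\psi(\tau)\, d\tau - \bar\psi - \bar\omega_3 t\Big| \\
&= \Big|\int_t^{+\infty} (\dot\psi(\tau) - \bar\omega_3)\, d\tau\Big| \le \frac{2\, c(\Omega + |\omega|)}{\inf_{\tau \ge t_1} |\sin\theta(\tau)|}\, \frac{e^{-(\lambda_1 - \alpha)t}}{\lambda_1 - \alpha}
\end{aligned} \qquad (5.2.19)$$


proving Eq. (5.2.13). Naturally, the time $t_1$ has to be chosen so that $\inf_{\tau \ge t_1}
|\sin\theta(\tau)| \ge \frac12 |\sin\bar\theta| > 0$, say.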

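To prove Eq. (5.2.12) remark that from Eq. (5.1.19), multiplying the first
equation by $\omega_1$ and the second by $\omega_2$ and adding the results, one finds


$$\frac{d}{dt} \frac12 (\omega_1^2 + \omega_2^2) \le -(\lambda_1 - \alpha)(\omega_1^2 + \omega_2^2), \quad \text{hence} \qquad (5.2.20)$$

$$(\omega_1(t)^2 + \omega_2(t)^2) \le (\omega_1(0)^2 + \omega_2(0)^2)\, e^{-2(\lambda_1 - \alpha)t}. \qquad (5.2.21)$$


Furthermore, setting $z = \omega_3 - \bar\omega_3$, the third of the Eqs. (5.1.19) becomes

$$\begin{aligned}
\dot z = \dot\omega_3 &= -\lambda_1 z - \lambda_2''' (\omega_3^3 - \bar\omega_3^3) - (\lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2)\, \omega_3 \\
&= (-\lambda_1 - \lambda_2' \omega_1^2 - \lambda_2'' \omega_2^2)\, z - \lambda_2''' (\omega_3^2 + \omega_3 \bar\omega_3 + \bar\omega_3^2)\, z \qquad (5.2.22) \\
&\quad - \bar\omega_3 (\lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2).
\end{aligned}$$


Since the general solution to the equation

$$\dot y = f(t)\, y + g(t), \qquad t \ge 0, \qquad (5.2.23)$$

is, $\forall f, g \in C^\infty(\mathbf{R})$,


$$y(t) = y(0)\, e^{\int_0^t f(\tau)\, d\tau} + \int_0^t g(\tau)\, e^{\int_\tau^t f(\tau')\, d\tau'}\, d\tau, \qquad (5.2.24)$$

Eq. (5.2.22) implies

$$\begin{aligned}
z(t) = {}& z(0)\, e^{-\int_0^t (\lambda_1 + \lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2 + \lambda_2''' (\omega_3^2 + \omega_3 \bar\omega_3 + \bar\omega_3^2))\, d\tau} \\
& - \bar\omega_3 \int_0^t (\lambda_2' \omega_1(\tau)^2 + \lambda_2'' \omega_2(\tau)^2)\, e^{-\int_\tau^t (\lambda_1 + \lambda_2' \omega_1^2 + \lambda_2'' \omega_2^2 + \lambda_2''' (\omega_3^2 + \omega_3 \bar\omega_3 + \bar\omega_3^2))\, d\tau'}\, d\tau.
\end{aligned} \qquad (5.2.25)$$


The functions which multiply $\lambda_2', \lambda_2'', \lambda_2'''$ are nonnegative, therefore


$$\begin{aligned}
|z(t)| &\le |z(0)|\, e^{-\lambda_1 t} + \tilde\lambda_2 |\bar\omega_3|\, (\omega_1(0)^2 + \omega_2(0)^2) \int_0^t e^{-2(\lambda_1 - \alpha)\tau}\, e^{-\lambda_1 (t - \tau)}\, d\tau \\
&\le e^{-(\lambda_1 - \alpha)t} \Big(|z(0)| + \frac{\tilde\lambda_2 |\bar\omega_3|}{\lambda_1 - \alpha} (\omega_1(0)^2 + \omega_2(0)^2)\Big) \qquad (5.2.26) \\
&\le e^{-(\lambda_1 - \alpha)t} \Big(|\bar\omega_3| + |\omega_3(0)| + \frac{\tilde\lambda_2}{\lambda_1 - \alpha} |\bar\omega_3|\, \omega(0)^2\Big),
\end{aligned}$$

by Eq. (5.2.21) if $\tilde\lambda_2 \stackrel{\rm def}{=} \max(\lambda_2', \lambda_2'')$.
Hence, Eq. (5.2.12) follows from Eqs. (5.2.26) and (5.2.21) with

$$c(x)^2 = x^2 + \Big(|\bar\omega_3| + x + \frac{\tilde\lambda_2}{\lambda_1 - \alpha} |\bar\omega_3|\, x^2\Big)^2. \qquad (5.2.27)$$

□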
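
The exponential estimate (5.2.21) and the convergence to the stationary solution can be observed numerically. The following is a minimal Python sketch, not part of the original text: it integrates Eq. (5.1.19) for a choice of parameters with $\alpha < \lambda_1$, checks the bound (5.2.21) at every step, and measures the final distance from $(0, 0, \bar\omega_3)$, where $\bar\omega_3$ is obtained from Eq. (5.2.7) by bisection. All numerical values and helper names are illustrative assumptions.

```python
import math

LAM1, LAM2P, ALPHA, R = 1.0, (0.5, 0.8, 0.6), 0.4, 1.0   # alpha < lambda_1

def rhs(w):
    """Right-hand side of Eq. (5.1.19)."""
    w1, w2, w3 = w
    lam = LAM1 + LAM2P[0] * w1 ** 2 + LAM2P[1] * w2 ** 2 + LAM2P[2] * w3 ** 2
    return [-lam * w1 + ALPHA * w1 - w2 * w3,
            -lam * w2 + ALPHA * w2 + w1 * w3,
            -lam * w3 + R]

def rk4_step(w, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(w)
    k2 = rhs([x + 0.5 * dt * k for x, k in zip(w, k1)])
    k3 = rhs([x + 0.5 * dt * k for x, k in zip(w, k2)])
    k4 = rhs([x + dt * k for x, k in zip(w, k3)])
    return [x + dt * (a + 2 * b + 2 * c + d) / 6
            for x, a, b, c, d in zip(w, k1, k2, k3, k4)]

# stationary value of omega_3 from Eq. (5.2.7), by bisection on [0, R/lambda_1]
lo, hi = 0.0, R / LAM1
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if -(LAM1 + LAM2P[2] * mid * mid) * mid + R > 0:
        lo = mid
    else:
        hi = mid
W3_BAR = 0.5 * (lo + hi)

w, dt = [1.0, -0.5, 0.2], 1e-3
rho0 = w[0] ** 2 + w[1] ** 2
BOUND_OK = True
for n in range(1, 20001):                                   # up to t = 20
    w = rk4_step(w, dt)
    bound = rho0 * math.exp(-2 * (LAM1 - ALPHA) * n * dt)   # right side of (5.2.21)
    BOUND_OK = BOUND_OK and (w[0] ** 2 + w[1] ** 2 <= bound)
DIST = math.sqrt(w[0] ** 2 + w[1] ** 2 + (w[2] - W3_BAR) ** 2)
print("bound (5.2.21) respected:", BOUND_OK)
print("distance from the stationary solution at t = 20:", DIST)
```

In this run the decay is actually faster than the rate $\lambda_1 - \alpha$ guaranteed by the proposition, because the nonlinear friction terms dropped in Eq. (5.2.20) also contribute.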


The analysis for $\alpha > \lambda_1$ is much more interesting and involves quite a few
general ideas which will be discussed in the upcoming sections. The character
of motion will change: for $\alpha \gg \lambda_1$ it will be described, asymptotically for
$t \to +\infty$, by a behavior very different from the one seen so far, where the
gyroscope sets itself in a state of uniform rotation around the axis $\mathbf{i}_3$, fixed in
space.


5.2.1 Exercises 


1. Suppose that $R = 0$ in Eq. (5.1.18). Show that for $\alpha < \lambda_1$, something analogous to the
statement of Proposition 4 holds.


2. Same as Problem 1, for Eq. (5.1.19).


3. Show that for $\alpha > \lambda_1$ Eqs. (5.1.18) and (5.1.19) with $R = 0$ admit infinitely many
stationary solutions, and find them.


4. Consider a gyroscope like the one in Eq. (5.1.14), but assume that the friction is linear, 
that the two little jets arranged along the ig axis in —gig or +ọi2 produce a thrust in the 
direction i3 equal to f1i1, while the two jets on +o0'ig3 produce a constant thrust Rij. Show 
that the equations of motion become 


wi =— Aw, + w2w3 — 0 w3, 
w2 = — Aw — wiw +a, 
w3 = — w3 +0 w1 


N 


I i : 
7: Find the stationary 


for suitably chosen a,o and after a change of variables w; > wi 
solutions for the above equation. 
5. Set $\omega_3 = x$, $\omega_2 = z$, $\omega_1 = y$ and suppose that the friction is different for the different
components of the angular velocity; i.e., suppose that the friction moment is $(-\lambda_1 \omega_1, -\lambda_2 \omega_2,
-\lambda_3 \omega_3)$. Study the same problem as in Problem 4 with $\lambda_1 = 1$, $\lambda_2 = b$, $\lambda_3 = \sigma$, fixing
$b = \frac83$, $\sigma = 10$ (“Lorenz model”).


6. Find whether an analogue of Proposition 4 holds for the equations in Problems 4 and 5 
for some values of a. 


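7. Find the stationary solutions for the equations


$$\begin{aligned}
\dot\gamma_1 &= -2\gamma_1 + 4\gamma_2\gamma_3 + 4\gamma_4\gamma_5, \\
\dot\gamma_2 &= -9\gamma_2 + 3\gamma_1\gamma_3, \\
\dot\gamma_3 &= -5\gamma_3 - 7\gamma_1\gamma_2 + \alpha, \\
\dot\gamma_4 &= -5\gamma_4 - \gamma_1\gamma_5, \\
\dot\gamma_5 &= -\gamma_5 - 3\gamma_1\gamma_4.
\end{aligned}$$


Using the same method as in the proof of Proposition 4, for $\alpha$ small, find a proof of the statement
analogous to that appearing in Proposition 4, Eq. (5.2.12) (“five-mode approximation
to the Navier-Stokes equations on $T^2$”).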


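8. Same as Problem 7 for the equations


$$\begin{aligned}
\dot\gamma_1 &= -2\gamma_1 + 4\sqrt5\, \gamma_2\gamma_3 + 4\sqrt5\, \gamma_4\gamma_5, \\
\dot\gamma_2 &= -9\gamma_2 + 3\sqrt5\, \gamma_1\gamma_3, \\
\dot\gamma_3 &= -5\gamma_3 - 7\sqrt5\, \gamma_1\gamma_2 - 9\gamma_1\gamma_7 + \alpha, \\
\dot\gamma_4 &= -5\gamma_4 - \sqrt5\, \gamma_1\gamma_5, \\
\dot\gamma_5 &= -\gamma_5 - 3\sqrt5\, \gamma_1\gamma_4 - 5\gamma_1\gamma_6, \\
\dot\gamma_6 &= -\gamma_6 - 5\gamma_1\gamma_5, \\
\dot\gamma_7 &= -5\gamma_7 - 9\gamma_1\gamma_3,
\end{aligned}$$


(“seven-mode truncation of the Navier-Stokes equations on $T^2$”).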


5.3 Attractors and Stability 


For $\alpha > \lambda_1$, the motions of the model considered in §5.2 will exhibit a behavior
qualitatively different from that seen for $\alpha < \lambda_1$. It is therefore convenient to
introduce some notions well suited to discuss various results in suggestive and
agile language.

The notions on stability and attractors that will be introduced can be sub- 
jected to the same critiques already presented in Chapter 2 when we intro- 
duced similar notions; i.e., they should not be taken too seriously as absolute 
definitions. Usually everyone, motivated by their own aims, ideas, and needs,
introduces their own definitions, and it makes no sense to insist on a standard
nomenclature, just as it makes no sense to agree once and for all on the
choice of the units of measure of the various physical quantities. Here we shall
choose some significant definitions and not discuss alternative definitions,
recalling that in applications the “correct” notions of stability and attractivity
will be determined by the applications themselves.

In this and in the following sections, autonomous differential equations in
$\mathbb{R}^d$ of the form

$$\dot{x} = f(x) \qquad (5.3.1)$$

will be considered, supposing that the solutions have bounded trajectories,
see Definition 3, §2.5, p.28, i.e., that the solution flow $S_t$ to Eq. (5.3.1) has
the property that there exists a function $\mu : \mathbb{R}_+ \to \mathbb{R}_+$ such that

$$|S_t(u)| \le \mu(|u|), \qquad \forall\, t \ge 0,\ \forall\, u \in \mathbb{R}^d. \qquad (5.3.2)$$

Proposition 1, §5.2, p.369, shows that Eq. (5.1.19) has this property with
$\mu(|u|) = |u| + 2$.
The first interesting notion is that of a stable set.
The first interesting notion is that of a stable set. 


1 Definition. Consider the flow $S_t$ solving, for $t \ge 0$, a differential equation
in $\mathbb{R}^d$ like Eq. (5.3.1), with bounded trajectories. If $A \subset \mathbb{R}^d$, we denote by $S_t(A)$
the set of the points $u$ having the form $u = S_t(w)$ for some $w \in A$.

A set $A$ will be called “invariant” for Eq. (5.3.1) [or for the motions of Eq.
(5.3.1) or for its trajectories] if

$$S_t(A) \subset A, \qquad \forall\, t \ge 0, \qquad (5.3.3)$$

i.e., $A$ is invariant if the trajectories originating in $A$ develop, entirely, within
$A$. If the inclusion in (5.3.3) holds also for $t < 0$, the set $A$ will be called
“bi-invariant”.

An invariant set $A$ will be called “stable” for the evolution described by Eq.
(5.3.1) if every neighborhood $U$ of $A$ contains a neighborhood $V$ of $A$ such that

$$S_t(V) \subset U, \qquad \forall\, t \ge 0, \qquad (5.3.4)$$

i.e., $A$ is stable if motions starting sufficiently close to $A$ do not go too far
from it.
Examples 
(1) The equation of the harmonic oscillator,

$$\dot{x} = -y, \qquad \dot{y} = x, \qquad (5.3.5)$$

is an equation in $\mathbb{R}^2$ such that every circle around the origin is invariant and stable.

(2) Proposition 1, §5.2, relative to the gyroscope equation (5.1.19), provides another exam- 
ple. Equation (5.2.1) says that the ball with radius 22 is invariant. From Eq. (5.2.1), it also 
follows that it is stable. 
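
Example (1) can be checked numerically. The sketch below integrates Eq. (5.3.5) with a classical fourth-order Runge-Kutta step and records how far the radius drifts from its initial value; for the exact flow, a rotation, the drift is zero. The integrator, the step size and the starting point are illustrative choices, not part of the text.

```python
import math


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for x' = f(x)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def oscillator(state):
    """Right-hand side of Eq. (5.3.5)."""
    x, y = state
    return [-y, x]


def max_radius_drift(start, t_end=20.0, dt=0.001):
    """Largest deviation of the radius from its initial value along the orbit."""
    r0 = math.hypot(start[0], start[1])
    state, drift = list(start), 0.0
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(oscillator, state, dt)
        drift = max(drift, abs(math.hypot(state[0], state[1]) - r0))
    return drift


drift = max_radius_drift([0.6, 0.8])  # start on the unit circle
```

A small value of `drift` is consistent with the invariance of the circle through the starting point; it is numerical evidence, not a proof.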


Another notion, closely related to the above, is that of attractor. 

2 Definition. A closed set $A \subset \mathbb{R}^d$, invariant for the evolution associated
to Eq. (5.3.1), is called an “attractor” for the motions of Eq. (5.3.1) if there
exists an open set $U \supset A$ such that

$$\lim_{t\to+\infty} d(S_t(u), A) = 0, \qquad \forall\, u \in U, \qquad (5.3.6)$$

where $d(x, A)$ = (distance of $x$ from $A$) and the set $U$ is said to be a “partial
basin of attraction” for $A$.

The union of all the partial basins of attraction will be called the “attraction
basin” of $A$ and denoted as $B(A)$.

An attractor $A$ is called minimal if it does not contain any proper subset which
is also an attractor.

A partial basin of attraction $U$ for an attractor $A$ will be called “normal” if
for every $u \in U$ there is at least one point $\pi(u) \in A$ such that

$$\lim_{t\to+\infty} d(S_t(u), S_t(\pi(u))) = 0, \qquad (5.3.7)$$

and the point $\pi(u)$ will be called a “projection” of $u$ on $A$.


Examples and Observations 

(1) The ball with radius 22, as well as that with radius 2, are attractors for 
Eq. (5.1.19). The first statement follows from Eq. (5.2.2), while the second 
can be deduced from the remark following Eq. (5.2.5) by slightly improving 
it (exercise). 

(2) For $\alpha < \lambda_1$ the point $\bar\omega$ is an attractor for Eq. (5.1.19), as is shown by Eq.
(5.2.12). Its basin is all of $\mathbb{R}^3$, and it is a normal basin. Clearly, every basin
of attraction for an attractor consisting of just one point is normal for it.

(3) The unit circle is an attractor for the solutions of the equation in $\mathbb{R}^2$:

$$\dot{x} = -\frac{1}{2}\,x\,(x^2 + y^2 - 1), \qquad \dot{y} = -\frac{1}{2}\,y\,(x^2 + y^2 - 1). \qquad (5.3.8)$$

In fact, by multiplying the first of Eqs. (5.3.8) by $x$, the second by $y$ and adding
the results,

$$\frac{1}{2}\,\frac{d}{dt}(x^2 + y^2) = -\frac{x^2 + y^2}{2}\,(x^2 + y^2 - 1). \qquad (5.3.9)$$

Setting $\rho = x^2 + y^2$, this becomes $\dot\rho = -\rho(\rho - 1)$, implying, if $\rho(0) \ne 0$,

$$\frac{\rho(t) - 1}{\rho(t)} = \frac{\rho(0) - 1}{\rho(0)}\, e^{-t}, \qquad (5.3.10)$$

hence $\lim_{t\to+\infty} \rho(t) = 1$ and the attraction basin for the unit circle consists
of $\mathbb{R}^2 \setminus \{0\}$. The basin is normal because the point $(x, y) \ne 0$ has projection

$$\pi(x, y) = \left(\frac{x}{\sqrt{x^2 + y^2}},\ \frac{y}{\sqrt{x^2 + y^2}}\right) \qquad (5.3.11)$$

on it and, in this case, the projection on the attractor is unique.

As an exercise one can look at the trajectories of Eqs. (5.3.8) and at the
geometrical meaning of Eq. (5.3.11). The unit circle is an attractor consisting
of fixed points; it is also minimal.
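
The statements of example (3) lend themselves to a quick numerical check. The following sketch integrates Eq. (5.3.8) with a Runge-Kutta step (an illustrative choice of integrator), compares $x^2 + y^2$ with the closed form obtained from Eq. (5.3.10), and measures the distance between the evolved point and the projection of Eq. (5.3.11). The initial point and step size are assumptions made for the example.

```python
import math


def field(p):
    """Right-hand side of Eq. (5.3.8)."""
    x, y = p
    s = 0.5 * (x * x + y * y - 1.0)
    return (-x * s, -y * s)


def rk4_step(p, dt):
    """One classical Runge-Kutta step for the planar field above."""
    def shifted(k, c):
        return (p[0] + c * k[0], p[1] + c * k[1])

    k1 = field(p)
    k2 = field(shifted(k1, 0.5 * dt))
    k3 = field(shifted(k2, 0.5 * dt))
    k4 = field(shifted(k3, dt))
    return tuple(
        p[i] + dt * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
        for i in range(2)
    )


def rho_exact(rho0, t):
    """rho(t) solved from Eq. (5.3.10): (rho - 1)/rho = c * exp(-t)."""
    c = (rho0 - 1.0) / rho0
    return 1.0 / (1.0 - c * math.exp(-t))


p0 = (0.3, -0.4)
dt, steps = 0.001, 15000
p = p0
for _ in range(steps):
    p = rk4_step(p, dt)
t_end = dt * steps

rho_numeric = p[0] ** 2 + p[1] ** 2
rho_formula = rho_exact(p0[0] ** 2 + p0[1] ** 2, t_end)

# Projection of the initial point on the unit circle, Eq. (5.3.11).
r0 = math.hypot(p0[0], p0[1])
projection = (p0[0] / r0, p0[1] / r0)
gap = math.hypot(p[0] - projection[0], p[1] - projection[1])
```

Since the motion is radial, the evolved point approaches exactly the radial projection of the initial datum, which is the geometrical meaning of Eq. (5.3.11).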

(4) In general, it is not true that an attractor is a stable set. 

To obtain some understanding of the mechanism (somewhat pathological, in
fact) by which a point may be attractive without being stable, consider the
unit circle $S^1$ in $\mathbb{R}^2$ and let $f \in C^\infty(S^1)$ be a function described as $\theta \to f(\theta)$,
where $\theta \in [0, 2\pi]$ parameterizes a point on $S^1$. Suppose that $f(\theta) > 0$, $\forall\,\theta \in (0, 2\pi)$,
and $f(0) = 0 = f(2\pi)$; then, by the Taylor expansion, one realizes that
$1/f(\theta)$ is not summable to either the right or to the left of $0$. Consider the
equation

$$\dot\theta = f(\theta) \qquad (5.3.12)$$

as an equation of motion of a point moving on $S^1$, interpreting the angle $\theta$,
in Fig. 5.1, as the position.


Figure 5.1: Illustration of remark (4) via Eq. (5.3.12).


It appears immediately that since $f(0) = 0$, the point $\theta = 0$ is an equilibrium
position for Eq. (5.3.12). But if $\theta_0 > 0$, then $S_t(\theta_0) = \theta(t)$ increases with
$t$, because $f \ge 0$ and $f$ vanishes only for $\theta = 0$ or $\theta = 2\pi$, and it takes an infinite
amount of time to reach $2\pi$. This is so because the time needed to reach
$2\pi$ starting from $\theta_0 < 2\pi$ is $\int_{\theta_0}^{2\pi} \frac{d\theta}{f(\theta)} = +\infty$, since $f(\theta)^{-1}$ is not integrable.
However, in a finite time, $\theta(t)$ reaches any other position $\theta' \in (\theta_0, 2\pi)$ (as
$\int_{\theta_0}^{\theta'} \frac{d\theta}{f(\theta)} < +\infty$). Hence,

$$\lim_{t\to+\infty} \theta(t) = 2\pi, \qquad \forall\, \theta_0 \in (0, 2\pi). \qquad (5.3.13)$$

All circle points evolve counterclockwise towards $2\pi$, reaching it from the left,
with the obvious exception of the points $\theta_0 = 0$ and $\theta_0 = 2\pi$. Next, let $\mathbf{f}$ be
an $\mathbb{R}^2$-valued function in $C^\infty(\mathbb{R}^2)$ which in a circular annulus $U$ around the
unit circle has the value

$$\mathbf{f}(x, y) = \left(-y\,f(\theta) - \frac{x}{2}\,(x^2 + y^2 - 1),\ \ x\,f(\theta) - \frac{y}{2}\,(x^2 + y^2 - 1)\right) \qquad (5.3.14)$$

if $(r, \theta)$ are the polar coordinates of $(x, y)$.
The equation $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$ associated with Eq. (5.3.14) can be written in polar
coordinates and for $(x, y) \in U$:

$$\dot\theta = f(\theta), \qquad \dot{r} = -\frac{r}{2}\,(r^2 - 1) \qquad (5.3.15)$$

and the second relation shows that the set $U$ is invariant and that the unit circle
is an attractor. The first of Eqs. (5.3.15) shows that the point $\theta = 0$, $r = 1$
is a minimal attractor on the unit circle, which is unstable since arbitrarily
close to it, there are points reaching it after going as far as $\sim 2$ away (e.g., the
point $r = 1$, $\theta = \varepsilon > 0$), i.e., after traveling a distance approximately equal to
the circle diameter.
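
The mechanism of remark (4) can be made concrete with one admissible choice of $f$. The sketch below assumes $f(\theta) = 1 - \cos\theta$, which is smooth on the circle, positive on $(0, 2\pi)$ and has a second-order zero at $\theta = 0$, so that $1/f$ is not integrable there; this specific $f$ is an illustrative assumption and is not taken from the text. For it, Eq. (5.3.12) integrates in closed form to $\cot(\theta(t)/2) = \cot(\theta_0/2) - t$.

```python
import math


def theta(t, theta0):
    """Solution of theta' = 1 - cos(theta) with theta(0) = theta0 in (0, 2*pi)."""
    u = 1.0 / math.tan(theta0 / 2.0) - t  # cot(theta/2) decreases at unit rate
    return 2.0 * math.atan2(1.0, u)       # theta/2 lies in (0, pi)


def distance_to_fixed_point(th):
    """Euclidean distance between the circle point at angle th and the point at angle 0."""
    return 2.0 * abs(math.sin(th / 2.0))


theta0 = 1e-3                           # initial datum very close to theta = 0
t_far = 1.0 / math.tan(theta0 / 2.0)    # time at which theta reaches pi

start = distance_to_fixed_point(theta0)
farthest = distance_to_fixed_point(theta(t_far, theta0))
late = distance_to_fixed_point(theta(1.0e6, theta0))
```

The point starts at distance `start` from the equilibrium, travels to the diametrically opposite point (`farthest` equals the circle diameter) and only afterwards approaches the equilibrium again (`late` is small): attraction without stability.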

(5) As the reader may guess, the problem of finding the basin of attraction of 
an attractor is a difficult problem. Very often it is only possible to determine 
some partial basins of attraction. The same remark applies to the determina- 
tion of the minimal attractors. 

In many applications, knowing partial domains of attraction or non minimal
attractors is sufficient, and the knowledge of such “global properties” as the
maximal basins or the minimal attractors is not needed.

(6) It is convenient not to require that a partial basin of attraction $U$ for
$A$ be invariant. This may rightly be considered a natural requirement; note,
however, that $V = \bigcup_{t \ge 0} S_t(U)$ is an open invariant basin of attraction for
$A$, i.e., any partial basin of attraction for $A$ is contained inside an invariant
partial basin of attraction. The total basin $B(A)$ is obviously invariant.

(7) If the differential equation (5.3.1) is also normal in the past, see p.28, it is
possible to construct $B(A)$ from a partial basin $U$ for $A$ as $B(A) = \bigcup_{t \in \mathbb{R}} S_t(U)$.


The question of the normality of a basin U for an attractor A is obviously 
quite important. For simplicity, assume A bi-invariant. 

Intuitively, the normality of U with respect to A depends on two factors: 
the speed of approach of the points of U to A and the speed of reciprocal 
separation of two points in A. One can expect that U is normal with respect 
to A if the speed of reciprocal separation of two points in A is much smaller 
than the speed of approach to A by the points of U. 

To make precise this intuitive idea, let us introduce some new concepts. 


3 Definition. Let $U$ be a partial attraction basin for an attractor $A$ for Eq.
(5.3.1). We define the “attraction modulus” of $A$ for $U$ (or the “attractor
strength”) as the function

$$d_U(t) = \sup_{u \in U} d(S_t(u), A), \qquad (5.3.16)$$

and $d_U(t)$ may be $+\infty$. Note that $d_U(t)$ decreases monotonically with $t$.
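
As an illustration of Definition 3, the attraction modulus can be computed explicitly for the unit circle attractor of Eq. (5.3.8). The sketch assumes the annulus $U = \{1/2 < r < 2\}$ as partial basin (a hypothetical choice made for this example) and uses the radial solution $\rho(t) = (1 - c\,e^{-t})^{-1}$, $c = (\rho(0) - 1)/\rho(0)$, that follows from Eq. (5.3.10); since the radius evolves monotonically with the initial radius, the supremum is approached at the two boundary radii.

```python
import math


def radius(r0, t):
    """Radius at time t of the solution of Eq. (5.3.8) starting at radius r0 > 0."""
    rho0 = r0 * r0
    c = (rho0 - 1.0) / rho0
    return math.sqrt(1.0 / (1.0 - c * math.exp(-t)))


def attraction_modulus(t, r_in=0.5, r_out=2.0):
    """d_U(t) for the annulus r_in < r < r_out around the unit circle."""
    return max(1.0 - radius(r_in, t), radius(r_out, t) - 1.0)


times = (0.0, 1.0, 5.0, 10.0)
values = [attraction_modulus(t) for t in times]
```

The values decrease with $t$ and decay roughly like $e^{-t}$, i.e. the attraction has exponential strength on this basin.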


Together with this notion, it is convenient to introduce another notion
measuring how quickly two points on $A$ can separate from each other. Note
that from the regularity theorem for differential equations, see §2.4, it follows
that if $\Gamma$ is a bounded closed set, the quantity

$$m_t(\Gamma) = \sup_{x \ne x' \in \Gamma} \frac{|S_t(x) - S_t(x')|}{|x - x'|} \qquad (5.3.17)$$

is finite for all $t \ge 0$ and bounded on every finite interval $[0, T]$, $T > 0$. It can
be naturally called the “maximal expansion rate” for Eq. (5.3.1) relative to
$t \in \mathbb{R}_+$ and to $\Gamma \subset \mathbb{R}^d$. To this notion, the following definition is related.


4 Definition. Let $A$ be a bi-invariant attractor for Eq. (5.3.1) which is not a
single point. The “uniform coefficient of maximal expansion” for Eq. (5.3.1)
on $A$ will be defined as the quantity

$$M_t(A) = \sup_{\substack{x \ne x' \in A \\ |\tau| \le t}} \frac{|S_\tau(x) - S_\tau(x')|}{|x - x'|} = \sup_{|\tau| \le t} m_\tau(A). \qquad (5.3.18)$$

Note that $M_t(A)$ is monotonically increasing with $t$ for $t \ge 0$.


Observations.

(1) The normality and boundedness assumptions on trajectories of Eq. (5.3.1),
made at the beginning of this section, do not guarantee existence of global
solutions in the past for all initial data.⁵ Hence, it is important to stress that
in Eq. (5.3.18) $A$ is bi-invariant and negative times are also involved.

(2) Even if $A$ is bounded, so that $|S_\tau(x) - S_\tau(x')| \le$ {diameter of $A$} for all $\tau$,
the function $M_t(A)$ can increase very rapidly with $t$. A simple though rather
trivial example is the following. Let $f \in C^\infty(\mathbb{R})$ be such that

$$f(x) = x \ \ \text{if } |x| < \tfrac{1}{2}, \qquad f(x) = -x\,(x^2 - 1) \ \ \text{if } |x| \ge 1. \qquad (5.3.19)$$

Then the interval $[-1, 1]$ is an attractor for the solutions of the differential
equation $\dot{x} = f(x)$ and

$$M_t([-1, 1]) \ge e^{t}. \qquad (5.3.20)$$

Eq. (5.3.20) follows by considering the evolutions of $x_0 = 0$ and $x_1 = \varepsilon \ne 0$.

(3) By definition, $M_t(A) \ge 1$. When $A$ is a single point, we shall set $M_t(A) = 1$.

(4) If $A$ is a periodic orbit with minimal period $T > 0$, then $M_t(A)$ is bounded
in $t$ and $M_t(A) \le M_T(A)$, $\forall\, t \ge 0$.


The following proposition makes quantitative the idea discussed above 
about the normality of an attraction basin U for an attractor A of Eq. (5.3.1). 
It provides a sufficient, though by no means necessary, condition for the nor- 
mality of a basin. 


5 Proposition. Let $A$ be a bounded bi-invariant attractor for Eq. (5.3.1) and
let $U$ be an attraction basin for $A$. Assume the existence of $C > 0$, $\varepsilon > 0$ such
that for all $t \ge 0$:

$$M_{t+1}(A)^2\, d_U(t) \le \frac{C}{(1 + t)^{1+\varepsilon}}. \qquad (5.3.21)$$

Then $U$ is normal for $A$.
If $A$ is a periodic trajectory, it is normal if there is a $C_1 > 0$ such that

$$d_U(t) \le \frac{C_1}{(1 + t)^{1+\varepsilon}}. \qquad (5.3.22)$$

⁵ For instance, the differential equation $\dot{x} = -\frac{1}{2}x^3$ is normal in the future but not in the
past; its solutions cannot be extended beyond $t_0 = -x(0)^{-2}$.


Observations. 

(1) Note that the statement concerning the periodic orbits is a consequence of 
the general statement. In fact, if A is a periodic orbit, it is clear that M;(A) 
is bounded, see Observation (4), to Definition 4 above. 

(2) Equation (5.3.21) implies the existence of a constant $C_2$ such that

$$m_\tau(A) \le C_2, \qquad \forall\, \tau \in [-1, 1]. \qquad (5.3.23)$$

It also implies

$$\text{diameter of } U \le (2\,d_U(0) + \text{diameter of } A) \le (2C + \text{diameter of } A) < +\infty. \qquad (5.3.24)$$


PROOF. Let $t_n = n$, $n = 0, 1, \ldots$, and let $x \in U$. Let $a_n \in A$ be a point with
minimal distance from $S_n(x)$, among the points of $A$. The natural idea is that
a projection $\pi(x)$ of $x$ can be defined as

$$\pi(x) = \lim_{n\to\infty} S_{-n}(a_n). \qquad (5.3.25)$$

To prove the existence of the above limit, let us compare $S_{-n}(a_n)$ with
$S_{-n-1}(a_{n+1})$, assuming that $A$ is not a single point (a case in which everything
becomes trivial). Let $\bar{U}$ be the closure of $U$, bounded by Eq. (5.3.24). By
the remark after Eq. (5.3.17), $\sup_{\tau \in [0,1]} m_\tau(\bar{U}) \le \mu < +\infty$. By Eq. (5.3.21),

$$\begin{aligned}
|S_{-n}(a_n) - S_{-(n+1)}(a_{n+1})| &\le M_{n+1}(A)\,|S_1(a_n) - a_{n+1}| \\
&\le M_{n+1}(A)\,\big(|S_1(a_n) - S_{n+1}(x)| + |S_{n+1}(x) - a_{n+1}|\big) \\
&\le M_{n+1}(A)\,\big(|S_1(a_n) - S_1 S_n(x)| + d_U(n+1)\big) \\
&\le M_{n+1}(A)\,\big(m_1(\bar{U})\,d_U(n) + d_U(n+1)\big).
\end{aligned} \qquad (5.3.26)$$

Hence, the series $\sum_{n=0}^{\infty} |S_{-n}(a_n) - S_{-n-1}(a_{n+1})|$ converges and, therefore,
the limit of Eq. (5.3.25) exists. It also verifies

$$|\pi(x) - S_{-n}(a_n)| = \Big|\sum_{h=n+1}^{\infty} \big(S_{-h}(a_h) - S_{-(h-1)}(a_{h-1})\big)\Big| \le C\,M_{n+1}(A)^{-1}\,(1 + m_1(\bar{U})) \sum_{h=n+1}^{\infty} \frac{1}{h^{1+\varepsilon}} \xrightarrow[n\to\infty]{} 0. \qquad (5.3.27)$$

We now compare $S_n(\pi(x))$ with $S_n(x)$:

$$\begin{aligned}
|S_n(\pi(x)) - S_n(x)| &\le |S_n(\pi(x)) - a_n| + |a_n - S_n(x)| \\
&\le |S_n(\pi(x)) - S_n(S_{-n}(a_n))| + d_U(n) \le M_n(A)\,|\pi(x) - S_{-n}(a_n)| + d_U(n) \\
&\le C\,(1 + m_1(\bar{U})) \sum_{h=n+1}^{\infty} \frac{1}{h^{1+\varepsilon}} + d_U(n).
\end{aligned} \qquad (5.3.28)$$

Finally, if $t = n + \tau$, $\tau \in (0, 1)$, is large enough,

$$|S_t(\pi(x)) - S_t(x)| = |S_\tau S_n(x) - S_\tau S_n(\pi(x))| \le m_\tau(\bar{U})\,|S_n(x) - S_n(\pi(x))| \le \mu\,|S_n(x) - S_n(\pi(x))| \qquad (5.3.29)$$

because $S_n(x) \in U$ for $n$ large enough. Since the right-hand side of Eq. (5.3.29)
approaches zero, by Eq. (5.3.28) the proposition is proved. ∎


5.3.1 Exercises 


1. Investigate the normality of some basins of partial attraction for the attractors associated 
with the equation x = f(x) in R?: 


fæ) = (-y@? +9? -1) 


where ~ > 0 is a C™ function of its argument vanishing in 1 only. Show that the normality 
2) 02 
of the attractor is related to the convergence of the integral f a Tay near r = 1. 


2. An attractor may be minimal and non connected. Find an example. (Hint: Starting from
Observation (4) to Definition 2, p.377, improve the idea, i.e., take $f(\theta)$ vanishing not only in
$0$ and $2\pi$, but also in $\pi$, positive elsewhere, so that the integral $\int \frac{d\theta}{f(\theta)}$ diverges near $0$
and $\pi$.)


3. Consider a Hamiltonian system with $\ell$ degrees of freedom, integrable on some region
W of its phase space. Show that each of the tori covering W is an invariant set for the 
Hamiltonian flow. Each is stable, but none are attractive. 


4. In the context of Problem 3, note that the invariant tori in $W$ having pulsations $\omega$
with rational components are covered by periodic orbits. Show that none of these orbits
are stable if the Jacobian matrix $J_{ij} = \frac{\partial \omega_i}{\partial A_j}(A)$ has a non vanishing determinant in $W$.
(Hint. As close as we wish to a given “rational” torus, there must be one with rationally
independent components if $\det J \ne 0$ (use the implicit function theorem, or see Problem
15, §5.10, p.478). Every point on such “irrational torus” evolves covering it densely, and
this implies instability because .... On the contrary if $\frac{\partial \omega}{\partial A} \equiv 0$, the periodic orbits, when
existing, are stable.)


5. Let $A_1, A_2$ be two attractors with partial basins $U_1, U_2$, respectively. Show that $A_1 \cap A_2$
is an attractor with partial basin $U_1 \cap U_2$, if $A_1 \cap A_2 \ne \emptyset$.


6. Show that if the set of the attractors contained in a given bounded attractor $A$ is finite
then there is a minimal attractor in $A$.


7. Find an example of “an attractor without minimal attractors”. (Hint: Let $f \in C^\infty(\mathbb{R})$ be
everywhere positive for $x < 0$ except at the points $x_j = -\frac{1}{j}$, $j = 1, 2, \ldots$, where it vanishes
(so that $\int \frac{dx}{f(x)}$ does not converge near any of the $x_j$). Suppose, also, that $f(x) < 0$ for
$x > 0$. Then $\dot{x} = f(x)$ admits $[x_j, 0]$ as attractors; however, it has no minimal attractors
because $\{0\}$ is not an attractor.)


8. Show that the bi-invariance assumption is essential in Proposition 5. (Hint: Consider
$\dot{x} = -\alpha x$, $A = [-1, 1]$ and show that $A$ is not normal.)


9. Show that every bounded attractor $A$ contains a bi-invariant attractor $\tilde{A}$. (Hint: $\tilde{A} = \bigcap_{t \ge 0} S_t(A)$.)


5.4 The Stability Criterion of Lyapunov 


Consider a differential equation, like Eq. (5.3.1), with bounded trajectories. A 
simple and useful criterion for the stability of one of its stationary solutions 
(“fixed points”) is the following proposition (“Lyapunov’s theorem” ). 


6 Proposition. Let $x_0$ be an equilibrium point for Eq. (5.3.1), $\dot{x} = f(x)$, with
$f \in C^\infty(\mathbb{R}^d)$:

$$f(x) = (f^{(1)}(x), \ldots, f^{(d)}(x)) \qquad (5.4.1)$$

and define the “stability matrix” (or “Lyapunov matrix”)

$$L_{ij} = \frac{\partial f^{(i)}}{\partial x_j}(x_0), \qquad i, j = 1, \ldots, d. \qquad (5.4.2)$$

If the eigenvalues of $L$, i.e., the solutions of the $d$-th degree equation in $\lambda$

$$\det(L - \lambda) = 0 \qquad (5.4.3)$$

(see Appendix E), have a negative real part, then $x_0$ is stable and is locally
attractive with exponential strength.⁶
If at least one of the eigenvalues has a positive real part, then $x_0$ is unstable.
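
To see the criterion at work, the sketch below builds the stability matrix of Eq. (5.4.2) by central finite differences and classifies an equilibrium of a planar system through the real parts of the eigenvalues. The damped pendulum $\dot{x} = y$, $\dot{y} = -g \sin x - k y$ is a hypothetical example chosen for illustration (it is not one of the models of the text), and the function names are likewise assumptions; the eigenvalue formula used is the closed form valid for $2 \times 2$ matrices only.

```python
import cmath
import math


def jacobian(f, x0, h=1e-6):
    """Stability matrix L_ij = d f_i / d x_j at x0, by central differences."""
    d = len(x0)
    columns = []
    for j in range(d):
        xp, xm = list(x0), list(x0)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        columns.append([(a - b) / (2.0 * h) for a, b in zip(fp, fm)])
    # columns[j][i] holds d f_i / d x_j; transpose into row-major form.
    return [[columns[j][i] for j in range(d)] for i in range(d)]


def eigenvalues_2x2(L):
    """Roots of det(L - lambda) = 0 for a 2x2 matrix."""
    tr = L[0][0] + L[1][1]
    det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + disc) / 2.0, (tr - disc) / 2.0)


def damped_pendulum(state, g=9.8, k=0.5):
    """Illustrative planar field with equilibria at (0, 0) and (pi, 0)."""
    x, y = state
    return [y, -g * math.sin(x) - k * y]


def classify(f, x0):
    """Apply the criterion of Proposition 6 to the equilibrium x0."""
    lam = eigenvalues_2x2(jacobian(f, x0))
    if all(l.real < 0 for l in lam):
        return "stable and locally attractive"
    if any(l.real > 0 for l in lam):
        return "unstable"
    return "undecided by the criterion"


lower = classify(damped_pendulum, [0.0, 0.0])
upper = classify(damped_pendulum, [math.pi, 0.0])
```

Setting `k=0` in the pendulum makes the eigenvalues at the lower equilibrium purely imaginary, which is the situation where the criterion gives no information.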


Observations. 

(1) More precisely, if all the eigenvalues $\lambda_1, \ldots, \lambda_d$ of $L$ have a negative real
part, there exist $t_0 > 0$ (“halving time”) and $\bar\rho > 0$ such that for $|x_0 - w| < \bar\rho$,
one has

$$d(S_t(w), x_0) \le 2 \cdot 2^{-t/t_0}\,|w - x_0|, \qquad \forall\, t \ge t_0. \qquad (5.4.4)$$

⁶ I.e., there is a small enough neighborhood $U$ of $x_0$ which is a partial basin of attraction
for $x_0$ with an exponential strength of attraction, see Definition 3, p.378.

(2) The reason why the above proposition is true and natural is made clear
by the analysis of the “linear case”, i.e., by the analysis of Eq. (5.3.1) with

$$f^{(i)}(x) = \sum_{j=1}^{d} L_{ij}\,x_j = (Lx)_i. \qquad (5.4.5)$$


In this case, $x_0 = 0$ is a stationary point for the equation; the equation itself
can now be written as

$$\dot{x} = Lx, \qquad (5.4.6)$$

and its stability matrix is just $L$. As seen in the problems of §2.2-§2.6, one
can look for $d$ linearly independent solutions of Eq. (5.4.6) having the form

$$x(t) = e^{\lambda t}\,v. \qquad (5.4.7)$$

Such a solution exists if there exists $v \ne 0$ such that

$$Lv = \lambda v. \qquad (5.4.8)$$

If we assume that the $d$-th degree algebraic equation for $\lambda$, $\det(L - \lambda) = 0$,
has $d$ pairwise distinct roots $\lambda_1, \ldots, \lambda_d$ and if $v^{(1)}, \ldots, v^{(d)}$ are the associated
eigenvectors of Eq. (5.4.8), it is well known that $v^{(1)}, \ldots, v^{(d)}$ are linearly
independent (see Appendix E, p.523). Then the function of $t \in \mathbb{R}$:

$$x(t) = \sum_{j=1}^{d} \alpha_j\, e^{\lambda_j t}\, v^{(j)} \qquad (5.4.9)$$

is, for every choice of $\alpha_1, \ldots, \alpha_d \in \mathbb{C}$, a solution to Eq. (5.4.6).

By the linear independence of the vectors $v^{(1)}, \ldots, v^{(d)}$, by suitably fixing
the coefficients $\alpha_1, \ldots, \alpha_d$ one can impose that Eq. (5.4.9) verifies any preassigned
initial condition. Hence, Eq. (5.4.9) is the most general solution of Eq.
(5.4.6). If $\mathrm{Re}\,\lambda_i < 0$, $i = 1, \ldots, d$, it is clear that

$$|x(t)| \le e^{-\nu t} \sum_{j=1}^{d} |\alpha_j|\,|v^{(j)}|, \qquad \forall\, t \ge 0, \qquad (5.4.10)$$

where $-\nu = \max_{j=1,\ldots,d} \mathrm{Re}\,\lambda_j < 0$; hence, the origin is an attractor with basin
$\mathbb{R}^d$ itself. Every bounded sphere is attracted by the origin with exponential
strength, by Eq. (5.4.10).
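
The representation (5.4.9) and the bound (5.4.10) can be checked on a small example. The sketch below assumes the $2 \times 2$ matrix $L$ with rows $(-1, 1)$ and $(0, -3)$, whose eigenvalues $-1$ and $-3$ and unit eigenvectors can be written by hand; the matrix, the initial datum and the Runge-Kutta integrator used for comparison are all illustrative assumptions.

```python
import math

L = [[-1.0, 1.0], [0.0, -3.0]]
lams = (-1.0, -3.0)                          # eigenvalues of L
s5 = math.sqrt(5.0)
vecs = ((1.0, 0.0), (1.0 / s5, -2.0 / s5))   # unit eigenvectors v1, v2


def coefficients(w):
    """Solve w = a1*v1 + a2*v2 by Cramer's rule."""
    (a, c), (b, d) = vecs                    # columns of the eigenvector matrix
    det = a * d - b * c
    return ((w[0] * d - b * w[1]) / det, (a * w[1] - c * w[0]) / det)


def closed_form(w, t):
    """Solution of x' = Lx through w, written as in Eq. (5.4.9)."""
    al = coefficients(w)
    return tuple(
        sum(al[j] * math.exp(lams[j] * t) * vecs[j][i] for j in range(2))
        for i in range(2)
    )


def mat_vec(M, x):
    return (M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1])


def rk4_flow(w, t_end, steps=4000):
    """Numerical solution of x' = Lx, used only as an independent check."""
    dt = t_end / steps
    x = tuple(w)
    for _ in range(steps):
        k1 = mat_vec(L, x)
        k2 = mat_vec(L, (x[0] + 0.5 * dt * k1[0], x[1] + 0.5 * dt * k1[1]))
        k3 = mat_vec(L, (x[0] + 0.5 * dt * k2[0], x[1] + 0.5 * dt * k2[1]))
        k4 = mat_vec(L, (x[0] + dt * k3[0], x[1] + dt * k3[1]))
        x = tuple(
            x[i] + dt * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
            for i in range(2)
        )
    return x


w, t = (0.7, -1.1), 2.0
exact = closed_form(w, t)
numeric = rk4_flow(w, t)
alphas = coefficients(w)
nu = 1.0                                     # -nu = max Re(lambda_j)
bound = math.exp(-nu * t) * sum(abs(a) for a in alphas)  # eigenvectors have norm 1
```

Here `exact` and `numeric` agree to integration accuracy, and the norm of `exact` stays below `bound`, as Eq. (5.4.10) requires.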

If instead $\mathrm{Re}\,\lambda_1 > 0$ and $\mathrm{Im}\,\lambda_1 \ne 0$, say, and if $\lambda_2 = \bar\lambda_1$, $v^{(2)} = \overline{v^{(1)}}$
(the bar denotes complex conjugation),⁷ it is clear that by (5.4.9), the initial
datum $\varepsilon\,(v^{(1)} + v^{(2)})$ evolves into

$$2\,\varepsilon\, e^{\mathrm{Re}\,\lambda_1 t}\, \mathrm{Re}\big(v^{(1)} e^{i\,\mathrm{Im}\,\lambda_1 t}\big). \qquad (5.4.11)$$

Hence, arbitrarily close to the origin, there are points evolving indefinitely
far away from the origin. Therefore, $O$ not only does not attract, but it is
unstable.

⁷ Since $L$ is a real matrix its eigenvalues appear in complex-conjugate pairs or are real.
Similarly, the eigenvectors can be chosen to be either real or appearing in complex-conjugate
pairs corresponding to complex-conjugate eigenvalues.

The following proof will reduce the nonlinear case to the linear one. If
$\mathrm{Re}\,\lambda_j < 0$, $j = 1, 2, \ldots, d$, one shows that if a point is close enough to the
origin, then the nonlinear terms of $f$ can initially be neglected for the purposes
of studying the equation of the motion and, by the preceding argument, the
point starts approaching $O$. Therefore, the nonlinear terms become even less
important and, more and more precisely, the system will move as if it were
subject to a linear equation.

If $\mathrm{Re}\,\lambda_1 > 0$, on the contrary, $O$ cannot be stable because the initial
datum $\varepsilon\,(v^{(1)} + v^{(2)})$ moves away from the origin, if $\varepsilon$ is small enough, at
least as much as needed so that the nonlinear terms of the equation become
sizeable. This suffices to exclude stability of the origin, even though it cannot
exclude its attractivity (since the point could go far from $O$, roughly in the
$v^{(1)}, v^{(2)}$ plane, and then, under the influence of the nonlinearity, come
back towards $O$ along a direction $i$ where $\mathrm{Re}\,\lambda_i < 0$), except, of course, when
$\mathrm{Re}\,\lambda_i > 0$ for all $i = 1, \ldots, d$. The reader will recognize the above ideas in
the following proof.


PROOF. Let $U_R$ be a radius $R$ ball centered at the origin. Assuming that
$\mathrm{Re}\,\lambda_i < 0$, $i = 1, \ldots, d$, we must determine $\rho_0$ so that the evolution $t \to S_t(w)$
of an initial datum $w \in U_{\rho_0}$ develops, $\forall\, t \ge 0$, in $U_R$: $S_t(w) \in U_R$, $\forall\, t \ge 0$.

For simplicity, suppose that $\lambda_1, \ldots, \lambda_d$ are pairwise distinct. The reader
can think of the general case as a problem (basically, it is just an algebraic
problem). Proceed as in the small oscillations theory of §2.14, Proposition 20,
p.65, and write Eq. (5.3.1), assuming, without loss of generality, that $x_0 = 0$:

$$\dot{x} = Lx + (f(x) - Lx) \equiv Lx + N(x) \qquad (5.4.12)$$

where $N$ is an $\mathbb{R}^d$-valued $C^\infty(\mathbb{R}^d)$ function with a second-order zero in $O$.
By Taylor’s theorem, see Appendix B, given $R > 0$, there is a constant $C_R$
such that

$$|N(x)| \le C_R\,|x|^2, \qquad \forall\, x \in U_R. \qquad (5.4.13)$$

Consider Eq. (5.4.12) as an equation in which $N(x(t))$ is thought of as a
known function of $t$, $\forall\, t \ge 0$. Then a particular “solution” would be

$$p(t) = \int_0^t \sum_{j=1}^{d} e^{\lambda_j (t - \tau)}\, \alpha_j(N(x(\tau)))\, v^{(j)}\, d\tau \qquad (5.4.14)$$

where, in general, given $w \in \mathbb{R}^d$ we shall set

$$w = \sum_{j=1}^{d} \alpha_j(w)\, v^{(j)}. \qquad (5.4.15)$$
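
Since $v^{(1)}, \ldots, v^{(d)}$ is a basis in $\mathbb{C}^d$, such a representation is possible and
defines the coefficients $\alpha_j(w)$ (which, in general, may be complex even for real
$w$); and, furthermore, there is a constant $A$ such that

$$\sum_{j=1}^{d} |\alpha_j(w)| \le A\,|w|. \qquad (5.4.16)$$

We shall suppose to have chosen the vectors $v^{(i)}$ so that $|v^{(i)}| = 1$, $i = 1, \ldots, d$,
which implies that $A \ge 1$. Then the solution to Eq. (5.4.12), $t \to x(t)$, $t \ge 0$,
with the initial datum $w$ will be

$$x(t) = \sum_{j=1}^{d} e^{\lambda_j t}\, \alpha_j(w)\, v^{(j)} + \int_0^t \sum_{j=1}^{d} e^{\lambda_j (t - \tau)}\, \alpha_j(N(x(\tau)))\, v^{(j)}\, d\tau. \qquad (5.4.17)$$

The boundedness assumption on the trajectories implies existence of $\mu(R) < +\infty$
such that $|S_t(w)| \le \mu(R)$, $\forall\, t \ge 0$, $\forall\, w \in U_R$. Then, setting, $\forall\, \rho \le R$,

$$D_\rho(t) = \max_{|w| \le \rho,\ 0 \le \tau \le t} |S_\tau(w)|, \qquad (5.4.18)$$

one deduces from Eqs. (5.4.17), (5.4.18), (5.4.16), and (5.4.13):

$$|S_t(w)| \le A\,e^{-\nu t}\,|w| + A \int_0^t e^{-\nu (t - \tau)}\, C_{\mu(R)}\, D_\rho(\tau)^2\, d\tau \le A\rho + \frac{A\,C_{\mu(R)}}{\nu}\, D_\rho(t)^2, \qquad (5.4.19)$$

where $\nu = \min_{j=1,\ldots,d} |\mathrm{Re}\,\lambda_j|$. By the arbitrariness of $t$ and by the monotonicity
of $D_\rho(t)$, as a function of $t$, Eq. (5.4.19) means that

$$D_\rho(t) \le A\rho + \frac{A\,C_{\mu(R)}}{\nu}\, D_\rho(t)^2, \qquad (5.4.20)$$

i.e. if $4A^2 C_{\mu(R)}\,\nu^{-1}\rho < 1$, it must either be that

$$D_\rho(t) \ge \frac{1 + \sqrt{1 - 4A^2 C_{\mu(R)}\,\nu^{-1}\rho}}{2A\,C_{\mu(R)}\,\nu^{-1}} \ge \frac{1}{2A\,C_{\mu(R)}\,\nu^{-1}} \qquad (5.4.21)$$

or, if $K > 1$ is a suitably chosen constant ($R$-dependent),

$$D_\rho(t) \le \frac{1 - \sqrt{1 - 4A^2 C_{\mu(R)}\,\nu^{-1}\rho}}{2A\,C_{\mu(R)}\,\nu^{-1}} \le K\rho. \qquad (5.4.22)$$

If $|w| \le \rho_0 < R$ and $\rho_0$ is chosen so that

$$4A^2 C_{\mu(R)}\,\nu^{-1}\rho_0 < 1, \qquad K\rho_0 < \frac{1}{2A\,C_{\mu(R)}\,\nu^{-1}}, \qquad (5.4.23)$$

we see that Eq. (5.4.22), with $\rho = |w|$, must hold for all $t \ge 0$, by continuity, since for $t = 0$,

$$|w| = D_{|w|}(0) \le \rho_0. \qquad (5.4.24)$$

Hence for all $w \in U_{\rho_0}$,

$$D_{|w|}(t) \le K\,|w| \qquad (5.4.25)$$

which implies that $O$ is stable.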
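Attractivity of $O$ is obtained via the autonomy of Eq. (5.4.12) or
(5.3.1). If in fact there is a time $t_0 > 0$ and a $\bar\rho < \rho_0$ [choose here $\rho_0$ as given
by Eq. (5.4.23) with $R = 1$, say], such that

$$|S_t(w)| \le \frac{1}{2}\,|w|, \qquad \forall\, t \ge t_0,\ w \in U_{\bar\rho}, \qquad (5.4.26)$$

then by the autonomy of the differential equation it is

$$|S_t(w)| \le 2^{-n}\,|w|, \qquad \forall\, t \ge n t_0,\ w \in U_{\bar\rho}, \qquad (5.4.27)$$

as seen by iterating Eq. (5.4.26). Hence Eq. (5.4.27) implies

$$|S_t(w)| \le 2 \cdot 2^{-t/t_0}\,|w|, \qquad \forall\, t \ge t_0, \qquad (5.4.28)$$

because $t/t_0$ is, in general, not an integer. It remains to check Eq. (5.4.26). The
first of Eqs. (5.4.19), together with Eq. (5.4.25), implies

$$|S_t(w)| \le A\,e^{-\nu t}\,|w| + \frac{A\,C_{\mu(1)}}{\nu}\,K^2\,|w|^2 \le |w|\left(e^{-\nu t} A + \frac{A\,C_{\mu(1)}}{\nu}\,K^2\,\bar\rho\right), \qquad (5.4.29)$$

$\forall\, |w| \le \bar\rho$ with $\bar\rho$ arbitrary provided $\bar\rho \le \rho_0$. If $\bar\rho$ is chosen so small that
$\frac{A\,C_{\mu(1)}}{\nu}\,K^2\,\bar\rho < \frac{1}{4}$, it follows that

$$|S_t(w)| \le \left(e^{-\nu t} A + \frac{1}{4}\right)|w| \qquad (5.4.30)$$

and Eq. (5.4.26) follows by choosing $t_0$ so that $A\,e^{-\nu t_0} = \frac{1}{4}$, i.e. $t_0 = \frac{1}{\nu} \log 4A$.
The statement concerning the instability is left to the reader. ∎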
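
The decay estimate of Eq. (5.4.28) can be observed numerically on a simple nonlinear system. The sketch below assumes the planar field $\dot{x} = -x + y^2$, $\dot{y} = -2y + x^2$ (a hypothetical example: here $L$ is diagonal, $\nu = 1$, the eigenvectors are the coordinate axes and one may take $A = \sqrt{2}$ in Eq. (5.4.16)), so that $t_0 = \nu^{-1}\log 4A$. The initial datum is chosen small by hand; the sketch does not compute the radius $\bar\rho$ guaranteed by the proof, so it is an illustration rather than a verification of the constants.

```python
import math


def field(p):
    """Illustrative field: linear part diag(-1, -2) plus quadratic terms."""
    x, y = p
    return (-x + y * y, -2.0 * y + x * x)


def rk4_step(p, dt):
    """One classical Runge-Kutta step for the field above."""
    def shifted(k, c):
        return (p[0] + c * k[0], p[1] + c * k[1])

    k1 = field(p)
    k2 = field(shifted(k1, 0.5 * dt))
    k3 = field(shifted(k2, 0.5 * dt))
    k4 = field(shifted(k3, dt))
    return tuple(
        p[i] + dt * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
        for i in range(2)
    )


nu = 1.0                      # nu = min |Re lambda_j|
A = math.sqrt(2.0)            # |w1| + |w2| <= sqrt(2) |w| for the standard basis
t0 = math.log(4.0 * A) / nu   # halving time of the proof


def worst_ratio(w, t_end=12.0, dt=0.001):
    """Largest value of |S_t(w)| / (2 * 2^(-t/t0) * |w|) over t0 <= t <= t_end."""
    norm0 = math.hypot(w[0], w[1])
    p, worst = tuple(w), 0.0
    for n in range(1, int(round(t_end / dt)) + 1):
        p = rk4_step(p, dt)
        t = n * dt
        if t >= t0:
            bound = 2.0 * 2.0 ** (-t / t0) * norm0
            worst = max(worst, math.hypot(p[0], p[1]) / bound)
    return worst


ratio = worst_ratio((0.05, -0.03))
```

A value of `ratio` below 1 means that the computed trajectory respects Eq. (5.4.28) on the sampled times; the actual decay, governed by $e^{-\nu t}$, is considerably faster than the halving-time estimate.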


5.4.1 Exercises 


1. Compute the Lyapunov matrix for the stationary points of the equation $\dot{x} = \alpha\,x\,(1 - x)$,
$\alpha \in \mathbb{R}$, and find for which values of $\alpha$ they are stable.


2. Consider the pendulum differential equation on $\mathbb{R}^2$: $\dot{x} = y$, $\dot{y} = -g \sin x$. Find the
stationary points and compute their Lyapunov matrices, identifying the unstable ones. Find
the explicit values of the eigenvalues of the Lyapunov matrix relative to all the stationary
points and find the stable ones. (Hint: Stability cannot be decided on the basis of
Lyapunov’s criterion; use energy conservation instead.)


3. Consider the Euler equations Eq. (4.11.32)-(4.11.34), p.312. Assume $I_1 < I_2 < I_3$ and
compute the Lyapunov matrix of the stationary solutions different from $\omega = 0$. Show that
the only other stationary solutions are uniform rotations around either the $\mathbf{i}_1$ axis, or the
$\mathbf{i}_2$ axis, or the $\mathbf{i}_3$ axis. Show that for solutions of this type the Lyapunov criterion does not
exclude stability for the rotations around $\mathbf{i}_1$ and $\mathbf{i}_3$.


4. In the context of Problem 3, with $I_1 = I_2 = I$, $I_3 = J$, make use of the integrability of
the gyroscope to discuss the stability of the three uniform rotations.


5. Suppose that the differential equation in $\mathbb{R}^d$, $\dot{x} = f(x)$, admits a prime integral $A(x)$,
i.e. a function $A \in C^\infty(\mathbb{R}^d)$ such that, $\forall\, x \in \mathbb{R}^d$, $\forall\, t \ge 0$, it is $A(S_t(x)) = A(x)$. Suppose
that $A$ has a strict minimum at $x_0 \in \mathbb{R}^d$. Show that $x_0$ is a stable stationary point.


6. Use Problem 5 and the conservation of energy to discuss the stationary rotations of the 
frictionless gyroscope (with I; < I2 < I3) and their stability properties along the following 
lines. First find the Deprit variables of the uniform stationary rotations (see §4.11, p.317 
and p.320) around the inertia axis ip, k = 1,2,3. (Answer: Kz, A, A, y, p, Y + wt for i3). 
Then, using the Deprit Hamiltonian as a prime integral and Problem 5, show that the 
rotation around the $\mathbf{i}_3$ axis is stable if $I_3 > I_1, I_2$. (Hint: Note that the Deprit Hamiltonian
can be written as 
poA aiy cos? h42 L?) 
213 2 h Io 


which has a minimum when A = L if and only if Iz > I2, I3.) 


7. If the differential equation $\dot{x} = f(x)$ on $\mathbb{R}^d$ is such that there exists a function $A \in C^\infty(U)$,
$U \subset \mathbb{R}^d$, which is monotonically non increasing along the motions (i.e. $A(S_t(x)) \le A(x)$,
$\forall\, t \ge 0$, $\forall\, x \in U$, as long as $S_\tau(x) \in U$, $\forall\, \tau \in [0, t]$), we shall say that $A$ is a monotonic
function for the given differential equation in the domain $U$. If $A$ is monotonically decreasing
we call it a “Lyapunov function” for the differential equation. Show that every point where
a Lyapunov function has a strict minimum is a stable fixed point.


8. In the context of Problem 7, and under the assumptions of Proposition 6, define

$$A(w) = \int_0^{+\infty} |S_t(w)|^2\, 2^{t/(2t_0)}\, dt$$

for $|w|$ small enough, say, $|w| < \bar\rho$. Show that:

(i) $A$ is well defined for all $|w| < \bar\rho$ if $\bar\rho$ is chosen as in Eq. (5.4.29).

(ii) $A \in C^\infty(U_{\bar\rho})$, where $U_{\bar\rho} = \{w \mid |w| < \bar\rho\}$.

(iii) $A$ is a Lyapunov function in the sense of Problem 7.

(iv) $2^{t/(2t_0)} A(S_t(w))$ is monotonic in $t \ge 0$, $\forall\, w \in U_{\bar\rho}$.

(v) $A$ has a strict minimum at $w = 0$. This is the “second Lyapunov theorem” (on the
existence of a Lyapunov function whenever a stationary point has a stability matrix with
eigenvalues with negative real part).


9. Compute the function A of Problem 8 for the linear equation x = Lx, supposing that 
all the eigenvalues of L are pairwise distinct and have a negative real part. Show that A 
5 log2 > Wee log 4A, with 


is a positive definite quadratic form in w. (Answer: If yo = 


2t 
A being the constant introduced in Eq. (5.4.16) and not to be confused with the quadratic 
form A that we wish to compute, it is A(w) = eer +A; +0)71ai(w)a,;(w), where 


ai(w) is defined as in Eq. (5.4.15).) 


10. In the context of Problem 9, show that the ellipsoid $A(w) = a > 0$ has in $w$ an outer
normal $n(w)$ such that $n(w) \cdot Lw < 0$. (Hint: Note that $n(w) = \frac{\partial_w A(w)}{|\partial_w A(w)|}$ ($\partial_w$ denoting the
gradient); furthermore, by Problem 8 (iv), the derivative of $2^{t/(2t_0)} A(S_t(w))$ is non positive:

$$\frac{\log 2}{2 t_0}\,A + \frac{dA}{dt} \le 0 \quad \Longrightarrow \quad \frac{dA}{dt} \le -\frac{\log 2}{2 t_0}\,A,$$

so if $dA/dt = 0$, it must be that $A = 0$, i.e., $w = 0$ because $A$ is positive definite. However,
$dA/dt = \partial_w A(w) \cdot Lw$; hence, $\partial_w A(w) \cdot Lw < 0$ if $w \ne 0$.)


11. Show that the proof of Proposition 6 can be interpreted as saying that if $A_0(w)$ denotes
the Lyapunov function of the linear differential equation $\dot{x} = Lx$, see Problems 9 and 10,
and if $A(w)$ is the Lyapunov function of the differential equation $\dot{x} = f(x)$ with the origin as
a fixed point with Lyapunov matrix $L$, then, assuming that the real part of the eigenvalues
of $L$ is negative, $A(w) = A_0(w) + O(|w|^3)$.


12. Consider a one-parameter family of differential equations in $\mathbb{R}^d$: $\dot{x} = f(x, \alpha)$, with
$x_0 = 0$ being a stationary point for all values of $\alpha \in (a, b) \subset \mathbb{R}$. Suppose that the Lyapunov
matrix of $0$, $L(\alpha)$, has pairwise distinct eigenvalues, all with real part $\le -\nu < 0$, $\forall\, \alpha \in (a, b)$.
Let $\alpha_0 \in (a, b)$ and let $A_{\alpha_0}$ be the Lyapunov function of Problem 11, relative to the
equation $\dot{x} = f(x, \alpha_0)$. Show the existence of $\delta > 0$, $\varepsilon > 0$ such that the neighborhood
$V_\delta \stackrel{\mathrm{def}}{=} \{x \mid A_{\alpha_0}(x) < \delta\}$ has an outer normal $n(x)$ such that, $\forall\, x \in \partial V_\delta$, it is $n(x) \cdot f(x, \alpha) < 0$,
$\forall\, \alpha \in [\alpha_0 - \varepsilon, \alpha_0 + \varepsilon]$. (Hint: First consider the linear case, then Problems 10 and 11.)


13. From Problem 12, deduce that $V_\delta$ is invariant for the equation $\dot{x} = f(x, \alpha)$ for all
$\alpha \in [\alpha_0 - \varepsilon, \alpha_0 + \varepsilon]$. (Hint: Suppose the contrary and proceed per absurdum.)


14. Consider a Hamiltonian differential equation in $\mathbb{R}^{2d}$ associated with the Hamiltonian
function $H(p, q) = \frac{1}{2}p^2 + V(q)$. Let $(0, q_0)$ be an equilibrium point. Show that its Lyapunov
matrix has eigenvalues that can be collected into pairs of opposite value, either both real
or both purely imaginary. Furthermore, show that this implies that its stability cannot be
settled on the basis of the Lyapunov criterion, while its instability can sometimes be settled
on this basis. (Hint: Note that the Lyapunov matrix has the structure $L = \begin{pmatrix} A & -B \\ C & D \end{pmatrix}$
where $A, B, C, D$ are the $d \times d$ matrices

$$A \equiv 0, \qquad D \equiv 0, \qquad C \equiv 1, \qquad B_{ij} = \frac{\partial^2 V}{\partial q_i\, \partial q_j}(q_0).$$

So if $L \begin{pmatrix} u \\ v \end{pmatrix} = \lambda \begin{pmatrix} u \\ v \end{pmatrix}$ with $u, v \in \mathbb{C}^d$, it must be that $\lambda u + Bv = 0$, $u = \lambda v$ so that $-\lambda^2 v = Bv$.
But $B$ is symmetric so that its eigenvalues are real (see Appendix F), hence ...).


15. Show that the Lyapunov matrix eigenvalues are invariant under regular changes of
coordinates $y = \sigma(x)$. (Hint: If $\sigma$ is defined in the vicinity of the stationary point $x_0 \in \mathbb{R}^d$
for $\dot{x} = f(x)$ and if $J_{ij}(y) = \frac{\partial \sigma_i}{\partial x_j}(x)$, for $y = \sigma(x)$, is the Jacobian matrix of the nonsingular
change of coordinates (i.e., such that $\det J \ne 0$), then the differential equation becomes,
in $y$ coordinates, $\dot{y} = J(y)\,f(\sigma^{-1}(y))$, and this implies that the Lyapunov matrix at
$y_0 = \sigma(x_0)$ is $L' = J(y_0)\,L\,J(y_0)^{-1}$; hence, $\det(L' - \lambda) = \det(JLJ^{-1} - \lambda) = \det(J(L - \lambda)J^{-1}) = \det(L - \lambda)$.)


16. Let $H$ be a Hamiltonian function describing in some local system of coordinates $N$
point masses in $\mathbb{R}^d$ subject to conservative active forces and constrained by a bilateral
ideal constraint to a surface $\Sigma$ (in the sense of Chapter 3).

Let $(0, x_0)$, $x_0 \in \Sigma$, be a stationary point. Give arguments (or prove) that the eigenvalues of
the Lyapunov matrix for the Hamiltonian equations corresponding to the given stationary
point appear in pairs of opposite eigenvalues either both real or both purely imaginary. This
is a refinement of Problem 14 extending it to the case of a system ideally constrained to $\Sigma$.
(Hint: In a system of local regular coordinates around $x_0$ and adapted to $\Sigma$, the Lagrangian
takes the form [see Eq. (3.11.23), p.215]: $\mathcal{L} = \frac{1}{2}\sum_{i,j=1}^{\ell} g(\beta)_{ij}\,\dot\beta_i\,\dot\beta_j - V(\beta)$, with $g$ being
a $C^\infty$ positive-definite matrix function and with $V$ also of class $C^\infty$. So the Hamiltonian
is [see Eq. (3.11.25), p.215] $H = \frac{1}{2}\sum_{i,j=1}^{\ell} (g(\beta)^{-1})_{ij}\,p_i\,p_j + V(\beta)$. Hence, the matrix $L$ is
$\begin{pmatrix} A & -B \\ C & D \end{pmatrix}$ with

$$A \equiv 0, \qquad D \equiv 0, \qquad C = G^{-1}, \qquad B_{ij} = \frac{\partial^2 V}{\partial \beta_i\, \partial \beta_j}(\beta_0),$$

where $G_{ij} = g(\beta_0)_{ij}$ and $\beta_0$ is the point representing $x_0$ in our system of coordinates. So if
$L \begin{pmatrix} u \\ v \end{pmatrix} = \lambda \begin{pmatrix} u \\ v \end{pmatrix}$, $u, v \in \mathbb{C}^\ell$, this means $Bv + \lambda u = 0$, $G^{-1}u = \lambda v$, i.e., $(B + \lambda^2 G)v = 0$;
hence, $0 = \det(B + \lambda^2 G) = \det(B + \lambda^2 \sqrt{G}\sqrt{G}) = \det\big(\sqrt{G}\,(\sqrt{G}^{-1} B \sqrt{G}^{-1} + \lambda^2)\,\sqrt{G}\big) =
(\det G)\,\det(\sqrt{G}^{-1} B \sqrt{G}^{-1} + \lambda^2)$ (see Appendix F for the definition of the square root of
a positive-definite matrix). So, since $\sqrt{G}^{-1} B \sqrt{G}^{-1}$ is a symmetric matrix (because $G$ is
such, see Appendix F), it follows that $\lambda^2$ is real, positive or negative, etc.).


17. Show that Proposition 6 holds if the hypothesis of bounded trajectories is weakened 
into that of normality or even into no assumption at all; in the latter case, show that global 
solutions exist for $t \ge 0$ for initial data close enough to $x_0$. (Hint: Simply carefully examine
the proof of Proposition 6.)


5.5 Application to the Model of §5.1. The Notion of
Vague Attractivity of a Stationary Point


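In the case of Eq. (5.1.19), it is easy to compute the Lyapunov matrix relative
to the stationary solution $\bar\omega$:

$$L = \begin{pmatrix} \alpha - \lambda_1 - \lambda_2\bar\omega_3^2 & -\bar\omega_3 & 0 \\ \bar\omega_3 & \alpha - \lambda_1 - \lambda_2\bar\omega_3^2 & 0 \\ 0 & 0 & -\lambda_1 - 3\lambda_2\bar\omega_3^2 \end{pmatrix}, \tag{5.5.1}$$

whose eigenvalues are the diagonal entry $-\lambda_1 - 3\lambda_2\bar\omega_3^2$ and the pair

$$(\alpha - \lambda_1 - \lambda_2\bar\omega_3^2) \pm i\,\bar\omega_3. \tag{5.5.2}$$

Hence, $\bar\omega$ is stable and attractive for some of its neighborhoods not only if
$\alpha < \lambda_1$, as already seen in §5.2 and §5.3, but also for $\lambda_1 < \alpha < \lambda_1 + \lambda_2\bar\omega_3^2$, see
Proposition 6, §5.4, p.382. The attractivity of $\bar\omega$ in this interval of variability
of $\alpha$ is exponential near $\bar\omega$: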


if aœa— à — 3X703 <0 then (5.5.3) 
|S:(w) — @| < 2-27 |w — Gl. (5.5.4) 


if $|\omega - \bar\omega|$ is small enough; $t_0 > 0$ depends only on the matrix $L$ [see §5.4,
comment after Eq. (5.4.30)], and it can be estimated as inversely proportional
to $(\lambda_1 + \lambda_2\bar\omega_3^2) - \alpha$.

A discussion identical to the one developed in the case $\alpha < \lambda_1$ shows that
Eq. (5.5.3) implies that every motion of the gyroscope associated with an
evolution $t \to S_t(\omega)$ like Eq. (5.5.3), for the angular velocity of the comoving
frame, asymptotically tends to become a uniform rotation around the $i_3$ axis
which, in turn, tends to a fixed position in space.

The difference between the cases $\alpha < \lambda_1$ and $\lambda_1 < \alpha < \lambda_1 + \lambda_2\bar\omega_3^2$ lies in
the fact that now we can no longer guarantee that $\bar\omega$ is a “global attractor”,
i.e., with basin of attraction coinciding with $\mathbb{R}^3$. The criterion of Lyapunov
has, in fact, only a local character, and thus it can only lead to the recognition
of local stability, instability, or attractivity.
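
The local stability range discussed above can be made concrete with a short computation. This is only a sketch, assuming that the two non real eigenvalues of the Lyapunov matrix are $(\alpha - \lambda_1 - \lambda_2\bar\omega_3^2) \pm i\bar\omega_3$, as in Eq. (5.5.2); the function names and the numerical values of $\lambda_1, \lambda_2, \bar\omega_3$ are hypothetical.

```python
def transverse_eigenvalues(alpha, lam1, lam2, omega3):
    """Complex-conjugate pair (alpha - lam1 - lam2*omega3^2) +/- i*omega3."""
    real_part = alpha - lam1 - lam2 * omega3 ** 2
    return complex(real_part, omega3), complex(real_part, -omega3)


def critical_alpha(lam1, lam2, omega3):
    """Value of alpha at which the pair crosses the imaginary axis."""
    return lam1 + lam2 * omega3 ** 2


if __name__ == "__main__":
    lam1, lam2, omega3 = 1.0, 0.5, 2.0  # illustrative parameters only
    print("alpha_c =", critical_alpha(lam1, lam2, omega3))
    for alpha in (2.5, 3.5):
        print(alpha, transverse_eigenvalues(alpha, lam1, lam2, omega3))
```

For these sample values the real part changes sign at $\alpha = 3$: below it the Lyapunov criterion gives local attractivity, above it local instability.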

Of course, it is of interest to investigate whether or not the attraction basin
for $\bar\omega$ is all of $\mathbb{R}^3$, and if not, it would be important to understand where the
other attractors for the equation are located. However, this analysis could not
be done using general results such as the Lyapunov criterion and we shall
not discuss this point in further detail, contenting ourselves with the local
information found so far. In any event, it has to be stressed that these kinds
of problems are very difficult and very little understood in general.

The motion $\bar\omega$ is no longer stable for $\alpha > \lambda_1 + \lambda_2\bar\omega_3^2$, not even locally, by
the second part of Proposition 6, §5.4. We then inquire about what happens
to a solution of Eq. (5.1.19) following an initial datum $\omega$ slightly different
from $\bar\omega$ and for $\alpha > \alpha_c = \lambda_1 + \lambda_2\bar\omega_3^2$, at least for small $\alpha - \alpha_c$.

The first question is whether for $\alpha$ slightly larger than $\alpha_c$,

$$\alpha_c = \lambda_1 + \lambda_2\bar\omega_3^2, \tag{5.5.5}$$

the motion of the data $\omega$ close to $\bar\omega$ departs very much from the motion $\bar\omega$. As
we shall see, this question naturally leads to the following interesting notion
of “vague attractivity”.


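5 Definition. Let $(x,\alpha) \to f(x,\alpha)$ be an $\mathbb{R}^d$-valued $C^\infty(\mathbb{R}^d \times I)$ function
with $I$ = open interval, such that the differential equations

$$\dot{x} = f(x,\alpha), \tag{5.5.6}$$

parameterized by $\alpha \in I$, have uniformly bounded trajectories$^8$ with respect to
$\alpha \in I$ and, furthermore, admit a stationary solution $x_0 \in \mathbb{R}^d$ such that

$$f(x_0, \alpha) = 0. \tag{5.5.7}$$

$x_0$ will be called “vaguely attractive” near $\alpha_c \in I$ if there is a neighborhood $U$
of $x_0$ such that for every $\delta > 0$, one can find $t_\delta > 0$, $\varepsilon_\delta > 0$, $\sigma_\delta > 0$ such that

$$\begin{aligned}
S_t^{(\alpha)}(U) &\subset \Gamma(\delta), &\quad \forall t \ge t_\delta,\ \forall \alpha \in (\alpha_c - \varepsilon_\delta, \alpha_c + \varepsilon_\delta),\\
S_t^{(\alpha)}(\Gamma(\delta)) &\subset \Gamma(\sigma_\delta), &\quad \forall t \ge t_\delta,\ \forall \alpha \in (\alpha_c - \varepsilon_\delta, \alpha_c + \varepsilon_\delta),
\end{aligned} \tag{5.5.8}$$

with $\sigma_\delta \to 0$ as $\delta \to 0$. Here $S_t^{(\alpha)}$ is the solution flow for Eq. (5.5.6) and $\Gamma(\delta)$ = cube
with side $2\delta$ centered around $x_0$.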


$^8$ If $S_t^{(\alpha)}$ denotes the flow generated by Eq. (5.5.6), this means that the bound on the
trajectory of $x \in \mathbb{R}^d$, $|S_t^{(\alpha)}(x)| \le \mu(|x|)$, holds and $\mu$ is continuous and $\alpha$-independent.

Observations.

(1) In other words, $x_0$ is vaguely attractive near $\alpha_c$ if there is a neighborhood
$U$ which is a basin of attraction for an attractor containing $x_0$ and having a
diameter smaller than any arbitrarily prefixed length $\delta > 0$ for all $\alpha$'s close
enough to $\alpha_c$. Furthermore, this attractor, contained in $\Gamma(\delta)$, “uniformly
attracts” the points of $U$ and has a “weak stability”, as expressed more precisely
by the first and second of Eqs. (5.5.8), respectively.

Note that for $\alpha = \alpha_c$, the point $x_0$ must be attractive for the points in $U$.
In fact $x_0$ is vaguely attractive for $\alpha$ near $\alpha_c$ if and only if it is stable and
attractive for Eq. (5.5.6) with $\alpha = \alpha_c$.

(2) One can also say that $x_0$ is vaguely attractive near $\alpha_c$ if it is the attractor
of a neighborhood $U$ of $x_0$ for $\alpha = \alpha_c$, while for $\alpha$ close to $\alpha_c$ it still attracts
the points of $U$ not too close to $x_0$. The “attractivity away from $x_0$ is uniform
in $\alpha$” near $\alpha_c$.

(3) If in $\alpha_c$ the Lyapunov matrix $L(\alpha_c)$ for Eq. (5.5.6) relative to $x_0$ has
eigenvalues with a negative real part, it follows from the arguments of the proof of
Proposition 6, §5.4, that $x_0$ is vaguely attractive near $\alpha_c$. Actually, the set $U$
can be taken such that for some $\varepsilon_0 > 0$ it is $S_t^{(\alpha)} U \subset U$, $\forall\, |\alpha - \alpha_c| < \varepsilon_0$, $\forall t \ge 0$
(this follows from Problems 12 and 13, §5.4, p.388).

(4) Hence, the vague-attractivity notion is interesting only when $L(\alpha_c)$ has
some eigenvalues with a vanishing real part.

(5) All the upcoming examples of vague attractivity will have the property
that $U$ can be chosen to fulfill Eq. (5.5.8) for all $t \ge 0$. It seems not impossible
that the neighborhood $U$ of vague attractivity could always be chosen in this
way.

(6) The condition that $\Gamma(\delta)$ be a cube with side $2\delta$ centered at $x_0$ could be
equivalently replaced by the requirement that $\Gamma(\delta)$ be a family of neighborhoods
of $x_0$ with diameter tending to zero with $\delta$.

(7) The assumption that $x_0$ should be $\alpha$ independent is only apparently more
restrictive than the natural assumption of the existence of a stationary solution
$x^{(\alpha)}$ depending in a $C^\infty$-regular way on $\alpha$. With a change of coordinates,
one can always reduce the stability theory of such a stationary point to that
relative to the case when $x^{(\alpha)} = 0$.

(8) If $x_0$ is replaced in the above definition by an invariant set $A$ and $\Gamma_A(\delta)$ =
{set of the points at distance $< \delta$ from $A$}, one defines the notion of a “vague
attractor”.

This notion could be extended to the case when $A$ depends on $\alpha$, although
not as straightforwardly and as unambiguously as in the case $A = \{x^{(\alpha)}\}$
discussed in Observation (7).

(9) Last but not least, the vague attractivity of $x_0$ is a notion invariant under
changes of coordinates; it is also invariant under changes of the equation itself
(i.e., of the function $f(x,\alpha)$, for $x$ outside some neighborhood of $x_0$). Vague
attractivity is an “intrinsic local property” of Eq. (5.5.6) near $x_0$ and $\alpha_c$.

The above facts play an important role in the formulation of simple vague
attractivity criteria which show that it is a property that can be inferred from
the knowledge of the $x$ derivatives of $f(x,\alpha_c)$ in $x_0$ of order not exceeding
3. To illustrate this important fact and to provide, in this way, some simple
vague-attractivity criteria, it is convenient to introduce the notion of “normal
form” of a differential equation near a stationary solution.


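6 Definition. Let $\dot{x} = f(x,\alpha)$, $f \in C^\infty(\mathbb{R}^d \times \mathbb{R})$, be a differential equation
in $\mathbb{R}^d$ with uniformly bounded trajectories (see footnote 8 to p.383) and with
$x_0 = 0$ as a stationary point $\forall \alpha \in I = (a,b)$.

Let $L(\alpha)$ be the stability matrix at $x_0 = 0$ and suppose that $\lambda_1(\alpha), \bar\lambda_1(\alpha), \ldots,$
$\lambda_p(\alpha), \bar\lambda_p(\alpha), \lambda'_1(\alpha), \ldots, \lambda'_q(\alpha)$ are $2p+q$ of its eigenvalues, the first $2p$ being arranged
into complex-conjugate non real pairs and the last $q$ being real.

We say that the differential equation has, for $\alpha \in I$, a “normal form” with
respect to the mentioned eigenvalues of $L(\alpha)$ if, writing the coordinates of $x$
as $(x^{(1)}, \ldots, x^{(p)}, y^{(1)}, \ldots, y^{(q)}, z)$ with $x^{(j)} \in \mathbb{R}^2$, $j = 1,\ldots,p$, $y^{(i)} \in \mathbb{R}$,
$i = 1,\ldots,q$, and $z \in \mathbb{R}^{d-2p-q}$, the equation has the form, $\forall \alpha \in I$,

$$\begin{aligned}
\dot{x}^{(j)}_1 &= (\mathrm{Re}\,\lambda_j(\alpha))\, x^{(j)}_1 - (\mathrm{Im}\,\lambda_j(\alpha))\, x^{(j)}_2 + N^{(j)}_1(x^{(1)}, \ldots, x^{(p)}, y^{(1)}, \ldots, y^{(q)}, z, \alpha),\\
\dot{x}^{(j)}_2 &= (\mathrm{Im}\,\lambda_j(\alpha))\, x^{(j)}_1 + (\mathrm{Re}\,\lambda_j(\alpha))\, x^{(j)}_2 + N^{(j)}_2(x^{(1)}, \ldots, x^{(p)}, y^{(1)}, \ldots, y^{(q)}, z, \alpha),\\
\dot{y}^{(i)} &= \lambda'_i(\alpha)\, y^{(i)} + M^{(i)}(x^{(1)}, \ldots, x^{(p)}, y^{(1)}, \ldots, y^{(q)}, z, \alpha),\\
\dot{z} &= \tilde{L}(\alpha)\, z + P(x^{(1)}, \ldots, x^{(p)}, y^{(1)}, \ldots, y^{(q)}, z, \alpha),
\end{aligned} \tag{5.5.9}$$

$j = 1,\ldots,p$, $i = 1,\ldots,q$, $\tilde{L}(\alpha)$ being a $(d-2p-q) \times (d-2p-q)$ matrix with
$C^\infty$ entries (as functions of $\alpha$) and $N^{(j)}, M^{(i)}, P$ being $C^\infty$ functions of their
arguments with the extra property that $N^{(j)}, M^{(i)}$ have a zero of third order
at the origin in the $x, y, z$ variables, for all $\alpha \in I$, while $P$ has a second-order
zero, at least, at the origin (in the same variables).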


Observation. If p = 0 or q = 0 or d = 2p + q, the above definition makes sense 
in an obvious way by deleting parts of Eq. (5.5.9). 


Vague attractivity near $\alpha_c$ may be easily discussed once the equation is in
normal form with respect to the eigenvalues of $L(\alpha)$ whose real part vanishes
for $\alpha = \alpha_c$. In general, the equations that one wishes to study will not have
normal form, but they may acquire such a form after a change of variables.
This is just as suitable for vague-attractivity analysis, by Observation (9), p.391.

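For instance, Eq. (5.1.19) does not have normal form near $\alpha_c$ with respect
to the two complex eigenvalues of $L(\alpha)$. Therefore, before discussing a
vague-attractivity criterion, it is convenient to remark that there is a simple and
rather weak sufficient condition for the existence of a system of coordinates
where the equation $\dot{x} = f(x,\alpha)$ assumes normal form.

7 Proposition. Let $\dot{x} = f(x,\alpha)$, $f \in C^\infty(\mathbb{R}^d \times \mathbb{R})$, be a differential equation
parameterized by $\alpha$ and with uniformly bounded trajectories as $\alpha \in I = (a,b)$.
Suppose that $f(0,\alpha) = 0$ and let $L(\alpha)$ be the stability matrix of 0.

Suppose that for $\alpha \in I$, $L(\alpha)$ has $d$ pairwise-distinct eigenvalues $\lambda_1(\alpha), \ldots,$
$\lambda_d(\alpha)$, among which $2p$ are non real; write them as $\lambda_1(\alpha), \bar\lambda_1(\alpha), \ldots, \lambda_p(\alpha),$
$\bar\lambda_p(\alpha), \lambda'_1(\alpha), \ldots, \lambda'_q(\alpha)$, $(2p+q = d)$, and arrange them so that the functions
$\alpha \to \lambda_i(\alpha)$ are $C^\infty$-functions of $\alpha$, for $\alpha \in I$.$^9$

(i) There is a (global) coordinate system on $\mathbb{R}^d \times I$: $(\mathbb{R}^d \times I, \Xi)$, with basis
$\mathbb{R}^d \times I$, denoted $(x,\alpha) = \Xi(\xi^{(1)}, \ldots, \xi^{(p)}, \eta^{(1)}, \ldots, \eta^{(q)}, \alpha')$ with $\alpha' = \alpha$, $\xi^{(j)} \in \mathbb{R}^2$,
$\eta^{(h)} \in \mathbb{R}$, such that in the new coordinates, the equation takes the form,
$(j = 1,\ldots,p;\ h = 1,\ldots,q)$:

$$\begin{aligned}
\dot\xi^{(j)}_1 &= (\mathrm{Re}\,\lambda_j(\alpha))\,\xi^{(j)}_1 - (\mathrm{Im}\,\lambda_j(\alpha))\,\xi^{(j)}_2 + F^{(j)}_1(\xi^{(1)}, \ldots, \alpha),\\
\dot\xi^{(j)}_2 &= (\mathrm{Im}\,\lambda_j(\alpha))\,\xi^{(j)}_1 + (\mathrm{Re}\,\lambda_j(\alpha))\,\xi^{(j)}_2 + F^{(j)}_2(\xi^{(1)}, \ldots, \alpha),\\
\dot\eta^{(h)} &= \lambda'_h(\alpha)\,\eta^{(h)} + F'^{(h)}(\xi^{(1)}, \ldots, \alpha),
\end{aligned} \tag{5.5.10}$$

where $F^{(j)}_1, F^{(j)}_2, F'^{(h)}$ are in $C^\infty(\mathbb{R}^d \times I)$ and have a second-order zero at the
origin in the variables $\xi, \eta$ for each $\alpha \in I$.

(ii) If $\lambda_h(\alpha_c) \neq \lambda_k(\alpha_c) + \lambda_\ell(\alpha_c)$, $k,h,\ell = 1,\ldots,d$, and if the equation $\dot{x} =
f(x,\alpha)$ has already the form of Eq. (5.5.10) for $\alpha \in I$, there is a coordinate
system on a suitable neighborhood $U \times J$ of $(0,\alpha_c)$, $(U \times J, \Xi)$, such that in
the new coordinates, Eq. (5.5.10) takes normal form with respect to all the
eigenvalues of $L(\alpha)$. Calling $(\beta,\alpha')$ the new coordinates, the transformation
$\Xi$ can be chosen as

$$\beta_j = x_j - \sum_{k,\ell=1}^{d} S_{jk\ell}(\alpha)\, x_k x_\ell, \qquad \alpha' = \alpha \tag{5.5.11}$$

with $S_{jk\ell} = S_{j\ell k} \in C^\infty(J)$, a “quadratic change of coordinates”; its inverse
will (therefore$^{10}$) have the form

$$x_j = \beta_j + \sum_{k,\ell=1}^{d} S_{jk\ell}(\alpha)\, \beta_k \beta_\ell + G_j(\beta,\alpha), \qquad \alpha' = \alpha, \tag{5.5.12}$$

where $G_j \in C^\infty(\Xi(U \times J))$ has a third-order zero at $\beta = 0$, $\forall \alpha \in J$.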


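$^9$ Since the eigenvalues are supposed to be pairwise distinct and they are roots of a $d$-th
order polynomial, this is possible and it follows from general results in Algebra.
$^{10}$ By the implicit function theorem (see Appendix G).

Observations.
(1) Note that by defining, for $(j = 1,\ldots,p;\ h = 1,\ldots,q)$

$$\begin{aligned}
z^{(j)} &= \xi^{(j)}_1 + i\,\xi^{(j)}_2,\\
N^{(j)}(z^{(1)}, \bar z^{(1)}, \ldots, z^{(p)}, \bar z^{(p)}, \eta^{(1)}, \ldots, \eta^{(q)}, \alpha) &= F^{(j)}_1(\xi^{(1)}, \ldots) + i\,F^{(j)}_2(\xi^{(1)}, \ldots),\\
M^{(h)}(z^{(1)}, \bar z^{(1)}, \ldots, z^{(p)}, \bar z^{(p)}, \eta^{(1)}, \ldots, \eta^{(q)}, \alpha) &= F'^{(h)}(\xi^{(1)}, \ldots),
\end{aligned} \tag{5.5.13}$$

Eq. (5.5.10) assumes the more symmetric form

$$\begin{aligned}
\dot z^{(j)} &= \lambda_j(\alpha)\, z^{(j)} + N^{(j)}(z^{(1)}, \bar z^{(1)}, \ldots, \eta^{(q)}, \alpha), & j = 1,\ldots,p,\\
\dot\eta^{(h)} &= \lambda'_h(\alpha)\, \eta^{(h)} + M^{(h)}(z^{(1)}, \bar z^{(1)}, \ldots, \eta^{(q)}, \alpha), & h = 1,\ldots,q.
\end{aligned} \tag{5.5.14}$$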


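(2) If the eigenvalues $\lambda_1(\alpha_c), \bar\lambda_1(\alpha_c)$ are non degenerate and $\mathrm{Re}\,\lambda_1(\alpha_c) =
0$, $\mathrm{Im}\,\lambda_1(\alpha_c) \neq 0$, it follows from (ii) that it will be possible to put the equation
$\dot{x} = f(x,\alpha)$ into normal form with respect to $\lambda_1(\alpha_c), \bar\lambda_1(\alpha_c)$ in the sense
of Definition 6 above. This is obvious if $\lambda_k(\alpha_c) \neq \lambda_h(\alpha_c) + \lambda_\ell(\alpha_c)$, $\forall k, h, \ell$,
but it is also generally true as a consequence of (ii) (see below).

Suppose, in fact, that the equation has already the form of Eq. (5.5.10). We
then perform the quadratic change of coordinates that would put into normal
form (with respect to all the eigenvalues) the equation obtained from
Eq. (5.5.10) by replacing the eigenvalues $\lambda_1(\alpha), \bar\lambda_1(\alpha), \ldots, \lambda_p(\alpha), \bar\lambda_p(\alpha), \lambda'_1(\alpha), \ldots,$
$\lambda'_q(\alpha)$ by $(\tilde\lambda_1(\alpha), \ldots, \tilde\lambda_d(\alpha)) = (\lambda_1(\alpha), \bar\lambda_1(\alpha), \lambda_2(\alpha) + \varepsilon_2, \bar\lambda_2(\alpha) + \varepsilon_2, \ldots,$
$\lambda'_1(\alpha) + \varepsilon'_1, \ldots, \lambda'_q(\alpha) + \varepsilon'_q)$, where $\varepsilon_2, \ldots, \varepsilon'_1, \ldots$ are chosen so that the
condition $\tilde\lambda_k(\alpha) + \tilde\lambda_h(\alpha) \neq \tilde\lambda_\ell(\alpha)$, $\forall k, h, \ell$, is fulfilled (and the $\varepsilon_j, \varepsilon'_j$ are real).
Taking into account the quadratic nature of the maps of Eqs. (5.5.11) and
(5.5.12), it is clear that the original equation will take, in the new coordinates,
normal form with respect to the only two eigenvalues which have not
been modified, i.e., $\lambda_1(\alpha), \bar\lambda_1(\alpha)$. If the equation does not have the form of
Eq. (5.5.10), but $\lambda_1(\alpha), \bar\lambda_1(\alpha)$ are non degenerate, one can apply a similar
argument.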

(3) From the proof, it appears that the normal-form coordinates (for all the 
eigenvalues or, via the previous observations (2) suitably adapted, for some 
of them) can sometimes be found even when the “non resonance condition” 
on the eigenvalues [in (ii) above] is not fulfilled, provided the equation verifies 
additional properties. Such conditions can be explicitly stated by requiring 
that Eq. (5.5.22) below be solvable. In the problems 16 and 17 at the end of 
this section, we give some examples of explicit use of this remark. 


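PROOF. To find the $\xi^{(1)}, \ldots, \xi^{(p)}, \eta^{(1)}, \ldots, \eta^{(q)}$ coordinates, consider the eigenvectors
$(w^{(1)}(\alpha), \ldots, w^{(d)}(\alpha)) = (v^{(1)}(\alpha), \bar v^{(1)}(\alpha), \ldots, v^{(p)}(\alpha), \bar v^{(p)}(\alpha), v'^{(1)}(\alpha),
\ldots, v'^{(q)}(\alpha))$ of $L(\alpha)$ associated with the eigenvalues $(\lambda_1(\alpha), \bar\lambda_1(\alpha), \ldots, \lambda_p(\alpha),
\bar\lambda_p(\alpha), \lambda'_1(\alpha), \ldots, \lambda'_q(\alpha))$, $\alpha \in I$, respectively. At fixed $\alpha \in I$, such vectors are linearly
independent, by the assumption of distinct eigenvalues.

We may and shall assume that the above eigenvectors are $C^\infty$ functions
of $\alpha$. Since they form a basis in $\mathbb{C}^d$, any $x \in \mathbb{R}^d$ can be written, defining
$\zeta^{(j)} \stackrel{\text{def}}{=} \xi^{(j)}_1 + i\,\xi^{(j)}_2$, as

$$x = \sum_{j=1}^{p} \left[\zeta^{(j)}\, v^{(j)}(\alpha) + \bar\zeta^{(j)}\, \bar v^{(j)}(\alpha)\right] + \sum_{h=1}^{q} \eta^{(h)}\, v'^{(h)}(\alpha) \tag{5.5.15}$$

and, remarking that $v^{(j)}(\alpha), v'^{(h)}(\alpha)$ are eigenvectors of $L(\alpha)$, it is immediate
to check that in the $(\xi^{(1)}, \ldots, \xi^{(p)}, \eta^{(1)}, \ldots, \eta^{(q)})$ coordinates, the equation
$\dot{x} = f(x,\alpha)$ takes the form of Eq. (5.5.10).

To prove (ii), write Eq. (5.5.10) as

$$\dot{x}_j = f_j(x,\alpha) = \sum_{k=1}^{d} L_{jk}(\alpha)\, x_k + \sum_{k,\ell=1}^{d} F_{jk\ell}(x,\alpha)\, x_k x_\ell \tag{5.5.16}$$

with $F_{jk\ell} = F_{j\ell k} \in C^\infty(\mathbb{R}^d \times J)$, $j = 1,\ldots,d$.
Performing the change of coordinates in Eqs. (5.5.11) and (5.5.12), after
some algebra Eq. (5.5.16) becomes, in the new coordinates $\beta$,

$$\begin{aligned}
\dot\beta_j &= \dot{x}_j - 2\sum_{k,\ell=1}^{d} S_{jk\ell}(\alpha)\,\dot{x}_k x_\ell = \sum_{k=1}^{d} L_{jk}(\alpha)\,\beta_k\\
&\quad + \sum_{h,\ell=1}^{d} \Big\{ \sum_{k=1}^{d} \big( L_{jk}(\alpha) S_{kh\ell}(\alpha) - S_{jhk}(\alpha) L_{k\ell}(\alpha) - S_{j\ell k}(\alpha) L_{kh}(\alpha) \big) + F_{jh\ell}(0,\alpha) \Big\}\, \beta_h \beta_\ell + \tilde{G}_j(\beta,\alpha)
\end{aligned} \tag{5.5.17}$$

where $\tilde{G}_j$ has a third-order zero at $\beta = 0$, $\forall \alpha \in J$.

Therefore, if there is a solution $S_{jh\ell}(\alpha)$ to the linear system of $d\,\frac{d(d+1)}{2}$
equations in $d\,\frac{d(d+1)}{2}$ unknowns (recall that $S_{jh\ell} = S_{j\ell h}$, $F_{jh\ell} = F_{j\ell h}$)
described for $j,h,\ell = 1,\ldots,d$ by

$$\sum_{k=1}^{d} \big( L_{jk}(\alpha) S_{kh\ell}(\alpha) - S_{jhk}(\alpha) L_{k\ell}(\alpha) - S_{j\ell k}(\alpha) L_{kh}(\alpha) \big) + F_{jh\ell}(0,\alpha) = 0 \tag{5.5.18}$$

and if the solution $S_{jh\ell}$ depends on $\alpha$ in a $C^\infty$ way for $\alpha$ near $\alpha_c$, then
Proposition 7 will have been proved.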

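Define a matrix $W(\alpha)$ in terms of the eigenvectors of $L(\alpha)$, $w^{(1)}(\alpha), \ldots,$
$w^{(d)}(\alpha)$, as

$$W(\alpha)_{hk} \stackrel{\text{def}}{=} w^{(k)}_h(\alpha), \qquad h,k = 1,\ldots,d. \tag{5.5.19}$$

The linear independence of the eigenvectors $w$ implies that $\det W(\alpha) \neq 0$,
$\forall \alpha \in I$, so that $W(\alpha)^{-1}$ exists and is a $C^\infty$-matrix function of $\alpha \in I$, and if
a matrix $\Lambda(\alpha)$ is defined as $\Lambda(\alpha)_{hk} = \lambda_h(\alpha)\,\delta_{hk}$, it is

$$L(\alpha) = W(\alpha)\,\Lambda(\alpha)\,W(\alpha)^{-1} \tag{5.5.20}$$

(see Appendix F for details on this relation between a matrix, its eigenvalues,
and its eigenvectors). Inserting Eq. (5.5.20) into Eq. (5.5.18) one finds

$$\sum_{k,s=1}^{d} \big( W_{js}\lambda_s W^{-1}_{sk} S_{kh\ell} - S_{jhk} W_{ks}\lambda_s W^{-1}_{s\ell} - S_{j\ell k} W_{ks}\lambda_s W^{-1}_{sh} \big) + F_{jh\ell}(0,\alpha) = 0$$

(all quantities being evaluated at $\alpha$), and multiplying both sides by
$W(\alpha)^{-1}_{sj}\, W(\alpha)_{hq}\, W(\alpha)_{\ell p}$, summing over $j, h, \ell$, and setting

$$\begin{aligned}
\sigma_{spq}(\alpha) &= \sum_{j,h,\ell=1}^{d} W(\alpha)^{-1}_{sj}\, S_{jh\ell}(\alpha)\, W(\alpha)_{hq}\, W(\alpha)_{\ell p},\\
\varphi_{spq}(\alpha) &= \sum_{j,h,\ell=1}^{d} W(\alpha)^{-1}_{sj}\, F_{jh\ell}(0,\alpha)\, W(\alpha)_{hq}\, W(\alpha)_{\ell p},
\end{aligned} \tag{5.5.21}$$

one finds that Eq. (5.5.18) becomes

$$\varphi_{spq}(\alpha) + (\lambda_s(\alpha) - \lambda_p(\alpha) - \lambda_q(\alpha))\,\sigma_{spq}(\alpha) = 0 \tag{5.5.22}$$

which, since $\lambda_s(\alpha) - \lambda_p(\alpha) - \lambda_q(\alpha) \neq 0$ for $\alpha$ close enough to $\alpha_c$ by the assumption
in (ii), can certainly be solved uniquely for $\sigma$ and via Eq. (5.5.21) yields a $C^\infty$
solution to Eq. (5.5.18), $\forall \alpha \in J$. This solution is real because Eq. (5.5.18) is
a linear equation with real coefficients and real known terms. □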
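
The last step of the proof, the division by $\lambda_s - \lambda_p - \lambda_q$ in Eq. (5.5.22), can be sketched in a few lines. This is only an illustration under stated assumptions: the eigenvalues are given as a plain list, the known terms $\varphi_{spq}$ as a nested list, and the names `is_nonresonant` and `solve_sigma` as well as the sample data are hypothetical.

```python
def is_nonresonant(eigenvalues, tol=1e-12):
    """True if lambda_s != lambda_p + lambda_q for every triple (s, p, q)."""
    return all(abs(ls - lp - lq) > tol
               for ls in eigenvalues
               for lp in eigenvalues
               for lq in eigenvalues)


def solve_sigma(eigenvalues, phi):
    """Solve phi[s][p][q] + (l_s - l_p - l_q) * sigma[s][p][q] = 0 entry by entry."""
    d = len(eigenvalues)
    return [[[-phi[s][p][q] / (eigenvalues[s] - eigenvalues[p] - eigenvalues[q])
              for q in range(d)]
             for p in range(d)]
            for s in range(d)]


if __name__ == "__main__":
    eigs = [1j, -1j, -1.0]  # a purely imaginary pair plus one stable real eigenvalue
    phi = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]  # placeholder known terms
    print(is_nonresonant(eigs))
    print(solve_sigma(eigs, phi)[0][0][1])
```

The example eigenvalues mimic the situation of interest at $\alpha_c$; a list such as `[0.0, 1.0, -1.0]` fails the test because $1 + (-1) = 0$ is itself an eigenvalue, and then Eq. (5.5.22) need not be solvable.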


Observation. Note that in the above proof, the determination of the change of
coordinates leading to the form of Eq. (5.5.10) only involves the matrix $L(\alpha)$,
i.e., the first $x$ derivatives at the origin of $f(x,\alpha)$. The definition of $S_{jk\ell}(\alpha)$,
i.e., of the coordinates putting the equation into the normal form, only involves
$L(\alpha)$ and $F_{jk\ell}(0,\alpha)$, i.e., the first and second derivatives of $f(x,\alpha)$ at 0.


It is now possible to discuss a simple vague-attractivity criterion. 


8 Proposition. Let $\dot{x} = f(x,\alpha)$, $f \in C^\infty(\mathbb{R}^d \times \mathbb{R})$, be a differential equation
parameterized by $\alpha$, with uniformly bounded trajectories as $\alpha \in I = (a,b)$
(see footnote 8), and such that $f(0,\alpha) = 0$, $\forall \alpha \in I$.

Suppose that for $\alpha = \alpha_c$, the stability matrix of the origin, $L(\alpha_c)$, has one pair
of conjugate imaginary eigenvalues $\lambda_1(\alpha_c), \bar\lambda_1(\alpha_c) \neq 0$, while all the other $d-2$
eigenvalues have negative real parts.

Also suppose that the equation has normal form with respect to $\lambda_1, \bar\lambda_1$ near
$\alpha_c \in I$, and write the differential equations for the first two components of $x$,
$x_1$ and $x_2$, as

$$\begin{aligned}
\dot{x}_1 &= (\mathrm{Re}\,\lambda_1(\alpha))\, x_1 - (\mathrm{Im}\,\lambda_1(\alpha))\, x_2 + N_1(x_1, x_2, y, \alpha),\\
\dot{x}_2 &= (\mathrm{Im}\,\lambda_1(\alpha))\, x_1 + (\mathrm{Re}\,\lambda_1(\alpha))\, x_2 + N_2(x_1, x_2, y, \alpha),
\end{aligned} \tag{5.5.23}$$

with $N_1, N_2 \in C^\infty(\mathbb{R}^d \times \mathbb{R})$ and having a third-order zero at the origin $x_1 =
x_2 = 0$, $y = 0$, for all $\alpha \in I$, having denoted $y$ the last $d-2$ coordinates of $x$.

If $x_1 + i x_2 \stackrel{\text{def}}{=} \rho\, e^{i\theta}$ define

$$\gamma_\alpha(\theta) = \lim_{\rho \to 0}\, \lim_{y \to 0}\, \frac{x_1 N_1 + x_2 N_2}{(x_1^2 + x_2^2)^2} \tag{5.5.24}$$

for $\alpha \in I$. Then the origin is vaguely attractive near $\alpha_c$ if

$$\bar\gamma = \frac{1}{2\pi} \int_0^{2\pi} \gamma_{\alpha_c}(\theta)\, d\theta < 0, \tag{5.5.25}$$

while if $\bar\gamma > 0$ it is not vaguely attractive.

The same conclusions can be drawn under the sole assumption that the differential
equation takes the form of Eq. (5.5.23) without requiring that $N_1$ and
$N_2$ be of third order, but only requiring the existence of the limit Eq. (5.5.24),
i.e., only requiring that $x_1 N_1 + x_2 N_2$ be of fourth order.
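
The quantity $\bar\gamma$ of Eq. (5.5.25) is easy to evaluate numerically when the cubic part of $N_1, N_2$ is known. The following is a rough sketch, assuming that the nonlinear terms are already restricted to $y = 0$ and passed as two Python functions of $(x_1, x_2)$; the names `gamma_theta`, `gamma_bar` and the sample nonlinearity $N_1 = a x_1^3$, $N_2 = b x_2^3$ are illustrative choices, not taken from the text.

```python
import math


def gamma_theta(n1, n2, theta, rho=1e-3):
    """Approximate gamma(theta) = (x1*N1 + x2*N2) / rho^4 at a small radius rho."""
    x1, x2 = rho * math.cos(theta), rho * math.sin(theta)
    return (x1 * n1(x1, x2) + x2 * n2(x1, x2)) / rho ** 4


def gamma_bar(n1, n2, samples=720):
    """Angular average of gamma(theta) over [0, 2*pi) by the rectangle rule."""
    step = 2 * math.pi / samples
    total = sum(gamma_theta(n1, n2, k * step) for k in range(samples))
    return total * step / (2 * math.pi)


if __name__ == "__main__":
    a, b = -1.0, -3.0  # hypothetical cubic coefficients
    value = gamma_bar(lambda x1, x2: a * x1 ** 3, lambda x1, x2: b * x2 ** 3)
    print(value)  # for this choice gamma(theta) = a*cos^4 + b*sin^4
```

For this sample nonlinearity the average of $\cos^4\theta$ and of $\sin^4\theta$ is $3/8$, so $\bar\gamma = 3(a+b)/8$, which is negative here and would indicate vague attractivity by the criterion above.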


Observations.

(1) As already remarked, the assumption on the normality of the equation
with respect to $\lambda_1(\alpha), \bar\lambda_1(\alpha)$ is not really restrictive if (as assumed above)
$\mathrm{Im}\,\lambda_1(\alpha_c) \neq 0$ and if all the remaining eigenvalues have a negative real part.
In fact, one can always change coordinates and put the equation in this form
(see observation (2), p.394, to Proposition 7).

(2) The number $\bar\gamma$ can, in principle, be computed in any system of coordinates
in terms of the derivatives of first order, second order, and third order
of $f(x,\alpha_c)$ at $x = 0$, with respect to the $x$ coordinates. However, this calculation
may be very long in practical cases. For the computation of $\bar\gamma$, it is more
practical to first reduce the equation to the form of Eq. (5.5.23) using observation
(2) to Proposition 7, p.394, and then to compute $\bar\gamma$ via Eq. (5.5.25).

(3) A similar criterion holds if the equation has one real eigenvalue $\lambda'(\alpha)$ vanishing
at $\alpha_c$ while all the others remain with negative real part near $\alpha_c$: if
$x = (x_1, y)$ and assuming

$$\dot{x}_1 = \lambda'(\alpha)\, x_1 + N_1(x_1, y, \alpha) \tag{5.5.26}$$

with $N_1$ having a zero of third order at $x_1 = 0$, $y = 0$, $\forall \alpha \in J$, then a
vague-attractivity criterion is that $\bar\gamma = \lim_{x_1 \to 0} \frac{x_1 N_1(x_1, 0, \alpha_c)}{x_1^4} < 0$.

However, the above normal-form assumption, i.e., the assumption that $N_1$
should be of third order, is now restrictive. Sometimes it might be impossible
to find coordinates in which the equation for $x_1$ takes the form of Eq. (5.5.26).


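PROOF. For simplicity, we suppose that the only non real eigenvalues of $L(\alpha)$
are $\lambda(\alpha) = \lambda_1(\alpha) = \sigma(\alpha) + i\,\mu(\alpha) = \overline{\lambda_2(\alpha)}$ for $\alpha$ near $\alpha_c$; we also suppose that
the other eigenvalues are pairwise distinct and $\mu(\alpha) > 0$.

Let $\nu > 0$, $a > 0$ be such that $\lambda'_1(\alpha), \ldots, \lambda'_{d-2}(\alpha) < -\nu < 0$, $\mu(\alpha) >
\nu$, $\forall \alpha \in (\alpha_c - a, \alpha_c + a)$. We may and shall suppose that the equation takes
the form

$$\begin{aligned}
\dot{x}_1 &= \sigma(\alpha)\, x_1 - \mu(\alpha)\, x_2 + N_1(x_1, x_2, y, \alpha),\\
\dot{x}_2 &= \mu(\alpha)\, x_1 + \sigma(\alpha)\, x_2 + N_2(x_1, x_2, y, \alpha),\\
\dot{y}_j &= \lambda'_j(\alpha)\, y_j + \tilde{N}_j(x_1, x_2, y, \alpha),
\end{aligned} \tag{5.5.27}$$

$j = 1,\ldots,d-2$, with $N_1, N_2$ having a third-order zero at $x_1 = 0$, $x_2 = 0$, $y =
0$, $\forall \alpha \in (\alpha_c - a, \alpha_c + a)$, while $\tilde{N}_j$ has at least a second-order zero at the same
point, $\forall \alpha \in (\alpha_c - a, \alpha_c + a)$, see Proposition 7 (i), p.393.
By the Lagrange-Taylor theorem, see Appendix B, we can write, for $j = 1, 2$,

$$\begin{aligned}
N_j(x_1, x_2, y, \alpha) &= \sum_{h,k,\ell=1}^{2} N_{jhk\ell}(\alpha)\, x_h x_k x_\ell + \sum_{h,k=1}^{2} \sum_{\ell=1}^{d-2} N'_{jhk\ell}(\alpha)\, x_h x_k y_\ell\\
&\quad + \sum_{h=1}^{2} \sum_{k,\ell=1}^{d-2} N''_{jhk\ell}(\alpha)\, x_h y_k y_\ell + \sum_{h,k,\ell=1}^{d-2} N'''_{jhk\ell}(\alpha)\, y_h y_k y_\ell + \hat{N}_j(x_1, x_2, y, \alpha),
\end{aligned} \tag{5.5.28}$$

where $N, N', N'', N'''$ are $C^\infty$ functions of $\alpha \in (\alpha_c - a, \alpha_c + a)$ and $\hat{N}$ is a $C^\infty$
function of its arguments, for $\alpha \in (\alpha_c - a, \alpha_c + a)$, and it has a fourth-order
zero at the origin in the $x_1, x_2, y$ variables, $\forall \alpha \in (\alpha_c - a, \alpha_c + a)$.

For all $\alpha \in (\alpha_c - a, \alpha_c + a)$, if $x_1 + i x_2 \stackrel{\text{def}}{=} \rho\, e^{i\theta}$, it is

$$\gamma_\alpha(\theta) = \frac{\sum_{j,h,k,\ell=1}^{2} N_{jhk\ell}(\alpha)\, x_j x_h x_k x_\ell}{(x_1^2 + x_2^2)^2}. \tag{5.5.29}$$

To continue, first assume that $\gamma_\alpha(\theta) = \bar\gamma_\alpha < 0$, $\forall \alpha \in (\alpha_c - a, \alpha_c + a)$, i.e.,
suppose that $\gamma_\alpha(\theta)$ is $\theta$-independent. This severe restriction will be later
removed. Multiply the Eqs. (5.5.27) by $x_1, x_2, y_j$, respectively, and sum the first
two and, separately, the last $d-2$ to find, setting $\rho \stackrel{\text{def}}{=} \sqrt{x_1^2 + x_2^2}$,

$$\frac{1}{2}\frac{d\rho^2}{dt} = \sigma(\alpha)\,\rho^2 + \bar\gamma_\alpha\,\rho^4 + D_4, \qquad \frac{1}{2}\frac{d|y|^2}{dt} \le -\nu\,|y|^2 + D_3, \tag{5.5.30}$$

where $D_3$ and $D_4$ are $C^\infty$ functions of $x_1, x_2, y$ and of $\alpha \in (\alpha_c - a, \alpha_c + a)$
such that $\exists\, C_1, C_2 > 0$ which, for $\rho, y$ near zero, verify

$$|D_4| \le C_1\,(|y| + \rho^2)\,(\rho + |y|)^3, \qquad |D_3| \le C_2\,(\rho + |y|)^3. \tag{5.5.31}$$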

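Let $\beta, \delta_0 > 0$ be such that $(1+\beta)^3 C_1 (\beta + \delta_0) < \frac{1}{2}|\bar\gamma_\alpha|$, $\nu\beta^2\delta_0^2 > 2 C_2 (1+\beta)^3 \delta_0^3$, and
let $\delta_0$ be so small that for all $\delta \le \delta_0$ Eqs. (5.5.31) hold in

$$\Gamma(\delta) = \{x_1, x_2, y \mid \rho < \delta,\ |y| < \beta\delta\}. \tag{5.5.32}$$

Then we see that for $(x_1, x_2, y) \in \partial\Gamma(\delta)$, $\delta \le \delta_0$, it is

$$\begin{aligned}
\frac{1}{2}\frac{d\rho^2}{dt} &\le \sigma(\alpha)\,\delta^2 + \frac{1}{2}\bar\gamma_\alpha\,\delta^4, &\quad \text{if } \rho = \delta,\\
\frac{1}{2}\frac{d|y|^2}{dt} &\le -\frac{1}{2}\nu\beta^2\delta^2, &\quad \text{if } |y| = \beta\delta.
\end{aligned} \tag{5.5.33}$$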

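Hence, if $\alpha$ is very close to $\alpha_c$ the right-hand sides of both of Eqs. (5.5.33) are
negative. We use this to infer in a standard fashion that there is a function
$\varepsilon_\delta > 0$ such that $\forall \alpha \in (\alpha_c - \varepsilon_\delta, \alpha_c + \varepsilon_\delta)$, the set $\Gamma(\delta)$ is $S_t^{(\alpha)}$ invariant (where,
as usual, the solution flow for our equation is denoted $S_t^{(\alpha)}$). In fact, let $\varepsilon_\delta$ be
a monotonically decreasing function of $\delta \in (0, \delta_0]$ such that $\sigma(\alpha) < \frac{1}{4}|\bar\gamma_\alpha|\,\delta^2$ for
$\alpha \in (\alpha_c - \varepsilon_\delta, \alpha_c + \varepsilon_\delta)$. For such values of $\alpha$, the right-hand sides of both of
Eqs. (5.5.33) are negative.

Then let $x = (x_1, x_2, y) \in \Gamma(\delta)$ and let $\bar{t}$ = (first time $> 0$ such that
$S_{\bar t}^{(\alpha)}(x) \notin \Gamma(\delta)$) and note that either the first or the second of Eqs. (5.5.33)
(according to which side of $\partial\Gamma(\delta)$ is crossed) implies that $S_t^{(\alpha)}(x) \notin \Gamma(\delta)$ for
some earlier time $t < \bar t$, against the definition of $\bar t$. So $\bar t = +\infty$ and $\Gamma(\delta)$ is
invariant for $S_t^{(\alpha)}$, $\forall t \ge 0$, $\forall \alpha \in (\alpha_c - \varepsilon_\delta, \alpha_c + \varepsilon_\delta)$, $\forall \delta \le \delta_0$.

To prove vague attractivity, see Definition 5, and Observation (6), p.391,
it is natural to try to choose $U = \Gamma(\delta_0)$. Therefore, ask the following question:
given $\delta < \delta_0$ and $\alpha \in (\alpha_c - \varepsilon_\delta, \alpha_c + \varepsilon_\delta)$, can we find $t_\delta > 0$ such that $S_{t_\delta}^{(\alpha)}\Gamma(\delta_0) \subset
\Gamma(\delta)$?

Let $x \in \Gamma(\delta_0) \setminus \Gamma(\delta)$ and suppose that for $t$ in some interval $[0, T]$, $S_t^{(\alpha)}(x) \in
\Gamma(\delta_0) \setminus \Gamma(\delta)$. If we define $\delta(t)^2 \stackrel{\text{def}}{=} \max(\rho(t)^2, |y(t)|^2/\beta^2)$, the point $S_t^{(\alpha)}(x)$ is in
$\partial\Gamma(\delta(t))$ and Eq. (5.5.33), together with the assumption that $\forall t \in [0, T]$, $\delta \le
\delta(t) \le \delta_0$, imply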


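$$\delta(t)^2 \le \delta(0)^2 + 2T \max\left(\tfrac{1}{4}\bar\gamma_\alpha\,\delta^4,\ -\tfrac{1}{2}\nu\,\delta^2\right) \le \delta_0^2 - T M_\delta \tag{5.5.34}$$

with $M_\delta > 0$.$^{11}$ Hence, if $t_\delta = (\delta_0^2 - \delta^2)/M_\delta$, it follows that $T < t_\delta$. Hence,
$S_{t_\delta}^{(\alpha)}\Gamma(\delta_0) \subset \Gamma(\delta)$, $\forall \alpha \in (\alpha_c - \varepsilon_\delta, \alpha_c + \varepsilon_\delta)$.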

It is now clear that $x_0 = 0$ is vaguely attractive. By Definition 5, and
observation (6), p.391, one can take $U = \Gamma(\delta_0)$, $\sigma_\delta = \delta$, $\forall \delta \le \delta_0$, and $\varepsilon_\delta, t_\delta$ as above for
all $\delta \le \delta_0$, and Eq. (5.5.8) holds for $\delta \le \delta_0$. If $\delta > \delta_0$, Eq. (5.5.8) follows from
the supposed uniform boundedness of the trajectories.

So the proof of vague attractivity is complete as long as $\gamma_\alpha(\theta) = \bar\gamma_\alpha$, $\forall \alpha \in
(\alpha_c - a, \alpha_c + a)$. We must now remove this restriction.

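This will be achieved by studying a coordinate change $(x_1, x_2, y, \alpha) \to
(\xi_1, \xi_2, y', \alpha')$ with $\alpha' = \alpha$, $y' = y$, and

$$\begin{aligned}
\xi_i &= x_i + \sum_{j,k,\ell=1}^{2} a_{ijk\ell}(\alpha)\, x_j x_k x_\ell, & i = 1, 2,\\
x_i &= \xi_i - \sum_{j,k,\ell=1}^{2} a_{ijk\ell}(\alpha)\, \xi_j \xi_k \xi_\ell + H_i(\xi, \alpha), & i = 1, 2,
\end{aligned} \tag{5.5.35}$$

where $H_i$, $a_{ijk\ell}$ are $C^\infty$ functions of their arguments and defined in the
neighborhoods of $(0, \alpha_c)$ of the form $V \times I$, $V \subset \mathbb{R}^2$ open; furthermore $H_i$ have a
fourth-order zero at $\xi = 0$, in agreement with the cubic complex form of Eq. (5.5.39) below.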

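We must show that the coefficients of Eq. (5.5.35) can be so chosen that the two equations

$$\begin{aligned}
\dot{x}_1 &= \sigma(\alpha)\, x_1 - \mu(\alpha)\, x_2 + \sum_{h,k,\ell=1}^{2} N_{1hk\ell}(\alpha)\, x_h x_k x_\ell,\\
\dot{x}_2 &= \mu(\alpha)\, x_1 + \sigma(\alpha)\, x_2 + \sum_{h,k,\ell=1}^{2} N_{2hk\ell}(\alpha)\, x_h x_k x_\ell,
\end{aligned} \tag{5.5.36}$$

[see Eqs. (5.5.28) and (5.5.29)] are changed into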


$^{11}$ Here we use a lemma on integration theory: if $a, b \ge 0$ are two $C^\infty$-functions bounded
below by a positive constant $\sigma > 0$ and if $d(t) = \max(a(t), b(t))$ and $c(t) = \dot{a}(t)$ for
$a(t) > b(t)$, $c(t) = \dot{b}(t)$ for $a(t) < b(t)$, and $c(t) = \frac{1}{2}(\dot{a}(t) + \dot{b}(t))$ if $a(t) = b(t)$, then
$d(t) = d(0) + \int_0^t c(\tau)\, d\tau \le d(0) + t \sup_{0 \le \tau \le t} c(\tau)$.

This can be proved by remarking that

$$d(t) = \lim_{N \to \infty} \left(a(t)^N + b(t)^N\right)^{1/N} = d(0) + \lim_{N \to \infty} \int_0^t \frac{a(\tau)^{N-1}\dot{a}(\tau) + b(\tau)^{N-1}\dot{b}(\tau)}{\left(a(\tau)^N + b(\tau)^N\right)^{(N-1)/N}}\, d\tau$$

and the function under the integration sign is uniformly bounded in $N$ by
$2\max_{0 \le \tau \le t}(|\dot{a}(\tau)|, |\dot{b}(\tau)|)$ and it is pointwise convergent to $c(\tau)$. If $a(\tau) = b(\tau)$ has only
a finite number of solutions $\tau$, the possibility of taking the limit under the integral
sign is easily proved. If $a(\tau) = b(\tau)$ has infinitely many roots one can find a simple
approximation argument, recalling that $a, b$ are bounded below by $\sigma > 0$, to infer
$d(t) \le d(0) + t \sup_{0 \le \tau \le t} c(\tau)$. Alternatively, one can apply the dominated convergence
theorem of Lebesgue.

$$\begin{aligned}
\dot\xi_1 &= \sigma(\alpha)\,\xi_1 - \mu(\alpha)\,\xi_2 + \bar\gamma_\alpha\,\xi_1(\xi_1^2 + \xi_2^2) + \text{4th order terms},\\
\dot\xi_2 &= \mu(\alpha)\,\xi_1 + \sigma(\alpha)\,\xi_2 + \bar\gamma_\alpha\,\xi_2(\xi_1^2 + \xi_2^2) + \text{4th order terms}.
\end{aligned} \tag{5.5.37}$$

In fact, such a change of variables would manifestly change Eq. (5.5.27) into
an equation of the same normal form but with $\gamma_\alpha(\theta) = \bar\gamma_\alpha$.


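The existence of such a change of coordinates is easier to discuss after
introducing the variables $z \stackrel{\text{def}}{=} x_1 + i x_2$, $\bar z = x_1 - i x_2$, $\lambda \stackrel{\text{def}}{=} \sigma + i\mu$, $\zeta \stackrel{\text{def}}{=} \xi_1 +
i \xi_2$, $\bar\zeta = \xi_1 - i \xi_2$ and writing Eq. (5.5.36) as an equation for $z$, multiplying
the second equation by $i$ and adding it to the first (see [29]):

$$\dot{z} = \lambda(\alpha)\, z + a_3(\alpha)\, z^3 + a_2(\alpha)\, z^2 \bar z + a_1(\alpha)\, z \bar z^2 + a_0(\alpha)\, \bar z^3, \tag{5.5.38}$$

where $a_0, \ldots, a_3$ are complex numbers that can be obtained from the $N$'s by
suitable linear combinations. Similarly Eq. (5.5.35) in complex form is:

$$\begin{aligned}
\zeta &= z + A_3(\alpha)\, z^3 + A_2(\alpha)\, z^2 \bar z + A_1(\alpha)\, z \bar z^2 + A_0(\alpha)\, \bar z^3,\\
z &= \zeta - A_3(\alpha)\, \zeta^3 - A_2(\alpha)\, \zeta^2 \bar\zeta - A_1(\alpha)\, \zeta \bar\zeta^2 - A_0(\alpha)\, \bar\zeta^3 + H(\zeta, \bar\zeta, \alpha)
\end{aligned} \tag{5.5.39}$$

with $H$ having a zero of fourth order in $|\zeta|$ as $\zeta \to 0$, $\forall \alpha \in (\alpha_c - a, \alpha_c + a)$.
Note that Eqs. (5.5.38) and (5.5.39) also imply an expression for $\gamma_\alpha(\theta)$: if
$z = \rho\, e^{i\theta}$, then

$$\begin{aligned}
\gamma_\alpha(\theta) &= \rho^{-4}\, \mathrm{Re}\left(a_3(\alpha)\, \bar z z^3 + a_2(\alpha)\, \bar z^2 z^2 + a_1(\alpha)\, z \bar z^3 + a_0(\alpha)\, \bar z^4\right)\\
&= \mathrm{Re}\left(a_3(\alpha)\, e^{2i\theta} + a_2(\alpha) + a_1(\alpha)\, e^{-2i\theta} + a_0(\alpha)\, e^{-4i\theta}\right)
\end{aligned} \tag{5.5.40}$$

which follows after some algebra, starting with the observation that $(x_1 N_1 +
x_2 N_2) = \mathrm{Re}\,(\bar z N)$, if $N \stackrel{\text{def}}{=} N_1 + i N_2$ denotes the complex combination of the
nonlinear terms in the right-hand side of Eq. (5.5.36). Hence,

$$\bar\gamma_\alpha = \frac{1}{2\pi} \int_0^{2\pi} \gamma_\alpha(\theta)\, d\theta = \mathrm{Re}\, a_2(\alpha).$$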

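So the goal is to determine $A_3, A_2, A_1, A_0$ in Eq. (5.5.39) so that Eq. (5.5.38)
in the $\zeta$ variables has a third-order term of the form $a_2(\alpha)\,\zeta^2\bar\zeta$. A calculation
shows that the equation for $\zeta$ is

$$\begin{aligned}
\dot\zeta &= \dot z + 3A_3 z^2 \dot z + 2A_2 z \bar z \dot z + A_2 z^2 \dot{\bar z} + A_1 \bar z^2 \dot z + 2A_1 z \bar z \dot{\bar z} + 3A_0 \bar z^2 \dot{\bar z}\\
&= \lambda\zeta + \zeta^3 (a_3 + 2\lambda A_3) + \zeta^2\bar\zeta\, (a_2 + (\lambda + \bar\lambda) A_2) + \zeta\bar\zeta^2 (a_1 + 2\bar\lambda A_1)\\
&\quad + \bar\zeta^3 (a_0 - (\lambda - 3\bar\lambda) A_0) + \text{4-th order terms};
\end{aligned} \tag{5.5.41}$$

hence, we take

$$A_3 = -\frac{a_3}{2\lambda}, \qquad A_2 = 0, \qquad A_1 = -\frac{a_1}{2\bar\lambda}, \qquad A_0 = \frac{a_0}{\lambda - 3\bar\lambda}, \tag{5.5.42}$$

and, near $\alpha_c$, the equation becomes

$$\dot\zeta = \lambda(\alpha)\,\zeta + a_2(\alpha)\,\zeta^2\bar\zeta + \text{4-th order terms} \tag{5.5.43}$$

whose $\gamma_\alpha(\theta)$ function is $\gamma_\alpha(\theta) = \bar\gamma_\alpha = \mathrm{Re}\, a_2(\alpha)$.
The proof that if $\bar\gamma > 0$ the origin is not vaguely attractive is left as a
problem for the reader. □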
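
The role of $\mathrm{Re}\, a_2$ can be seen in a direct numerical experiment on the truncated complex equation $\dot\zeta = \lambda\zeta + a_2\zeta^2\bar\zeta$. The following is a minimal sketch, assuming that the fourth-order terms and the $y$ variables are simply dropped; the integrator is a standard fourth-order Runge-Kutta scheme, and the function name `simulate`, the step size and all parameter values are illustrative assumptions.

```python
def simulate(sigma, mu, a2, z0, dt=0.02, steps=50_000):
    """Integrate z' = (sigma + i*mu) z + a2 * z * |z|^2 with classical RK4."""
    lam = complex(sigma, mu)

    def rhs(z):
        # z^2 * conj(z) equals z * |z|^2
        return lam * z + a2 * z * abs(z) ** 2

    z = z0
    for _ in range(steps):
        k1 = rhs(z)
        k2 = rhs(z + 0.5 * dt * k1)
        k3 = rhs(z + 0.5 * dt * k2)
        k4 = rhs(z + dt * k3)
        z += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return z


if __name__ == "__main__":
    a2 = complex(-1.0, 0.3)  # Re a2 < 0: the vaguely attractive case
    for sigma in (-0.01, 0.01):  # slightly below and slightly above the critical value
        print(sigma, abs(simulate(sigma, 1.0, a2, 0.5 + 0j)))
```

In polar form the radius obeys $\dot\rho = \sigma\rho + (\mathrm{Re}\, a_2)\rho^3$, so for $\sigma > 0$ and $\mathrm{Re}\, a_2 < 0$ the motion settles on a circle of radius $\sqrt{\sigma/|\mathrm{Re}\, a_2|}$, which shrinks to the origin as $\sigma \to 0$; this is the small attractor that the definition of vague attractivity allows for.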


It may be interesting to state explicitly some elementary invariance criteria 
for sets, which have been implicitly proved in the course of the above proof of 
Proposition 8. 


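9 Proposition. (i) Let $U \subset \mathbb{R}^d$ be an open set with regular boundary $\partial U$.
Then $U$ is invariant for $\dot{x} = f(x)$, $f \in C^\infty(\mathbb{R}^d)$, if

$$f(x) \cdot n(x) < 0, \qquad \forall x \in \partial U, \tag{5.5.44}$$

where $n(x)$ is the outer normal to $\partial U$ in $x$.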
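(ii) Let $V \in C^1(\mathbb{R}^d)$ and let $U(\mu) = \{x \mid x \in \mathbb{R}^d,\ V(x) < \mu\}$. If $\dot{x} = f(x)$, $f \in
C^\infty(\mathbb{R}^d)$, is a differential equation and

$$\frac{\partial V}{\partial x}(x) \cdot f(x) < 0, \qquad \forall x \in \partial U(\mu), \tag{5.5.45}$$

then the set $U(\mu)$ is invariant. Furthermore, if

$$\sup_{x \in U(\mu) \setminus U(\mu')} \frac{\partial V}{\partial x}(x) \cdot f(x) = -C < 0 \tag{5.5.46}$$

for some $\mu' < \mu$, then

$$S_t\, U(\mu) \subset U(\mu'), \qquad \forall t \ge \frac{\mu - \mu'}{C}. \tag{5.5.47}$$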
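
The boundary condition of part (i), $f(x) \cdot n(x) < 0$ on $\partial U$, lends itself to a quick numerical check. This is only a sketch, assuming $d = 2$ and $U$ a disc centered at the origin, whose outer normal is the radial unit vector; the vector field and the function names are hypothetical examples, not taken from the text.

```python
import math


def vector_field(x1, x2):
    """Hypothetical planar field: a rotation combined with a contraction."""
    return (-x1 - x2, x1 - x2)


def max_outward_component(f, radius=1.0, samples=360):
    """Largest value of f(x) . n(x) over sample points of the circle |x| = radius."""
    worst = -math.inf
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        normal = (math.cos(theta), math.sin(theta))  # outer normal of the disc
        fx = f(radius * normal[0], radius * normal[1])
        worst = max(worst, fx[0] * normal[0] + fx[1] * normal[1])
    return worst


if __name__ == "__main__":
    print(max_outward_component(vector_field))
```

For this field $f(x) \cdot n(x) = -|x|$ on every circle, so the sampled maximum is negative and each disc satisfies the sufficient condition of Eq. (5.5.44); a sampled check of this kind is of course only suggestive, not a proof.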


Observations 

(1) This proposition can be extended to the case when $V$ is “piecewise $C^\infty$” by
replacing $\frac{\partial V}{\partial x}(x)$ with the set of the convex linear combinations of its extreme
values (i.e., by a suitable bundle of vectors “pointing out of $U(V(x))$” in $x$).
This is useful because sometimes $V$ may have a square or a cylinder as its level
surface, as was the case for $\Gamma(\delta)$ after Eq. (5.5.33), where $V(x) = \max(x_1^2 + x_2^2,\ |y|^2/\beta^2)$.




Figure 5.2 Geometric interpretation of a differential equation as a vector field. 


(2) The geometric interpretation of a differential equation is the following: at
every $w \in \mathbb{R}^d$, draw a vector $f(w)$, i.e., think of $f$ as a “vector field” over $\mathbb{R}^d$.
A solution to $\dot{x} = f(x)$ is associated with a curve in $\mathbb{R}^d$ which at every point
is tangent to the vector field at the same point. This curve is run at a speed
which at every point is equal to the modulus of the field vector at that point
and has the same direction, see Fig. 5.2.

(3) The first statement of Proposition 9 is illustrated in Fig. 5.3. 




Figure 5.3: A vector field at the boundary of an invariant set U which implies its invariance. 


Observation (1) is illustrated in Fig. 5.4: 


Figure 5.4: As in Fig. 5.3 for a set $U$ with singularities on the boundary.


In connection with the above remarks, it is useful to see some pictures of a
vaguely attractive point for an equation $\dot{x} = f(x,\alpha)$. The “loss of stability” of a
fixed point for a differential equation depending on a parameter $\alpha$, as a consequence
of the crossing of the imaginary axis by some eigenvalues of the stability matrix
as $\alpha$ varies, is called a bifurcation: hence the above vague attractivity analysis
deals with examples of vaguely attractive bifurcations.

In Fig. 5.5, we draw the vector field for $\alpha$ slightly larger than $\alpha_c$, in a
typical case of a vaguely attractive bifurcation of the origin.


Figure 5.5: A vector field following a vaguely attractive bifurcation in one real direction:
two attractive fixed points $x_\pm$ appear and the bifurcating point remains as a repulsive
fixed point ($x_0$).


In Fig. 5.6 we draw a vector field following a bifurcation of the origin with a
vaguely attractive loss of stability with two imaginary eigenvalues.


Figure 5.6: A vector field following a vaguely attractive bifurcation in one complex
direction: a periodic orbit $\gamma$ appears (solid line) and trajectories starting close to
$\gamma$, inside or outside it, spiral towards it (dashed lines), and the bifurcating point
remains as a repulsive fixed point ($x_0$).


The reader should try to understand such pictures by trying to draw them 
on the basis of the above information and comments on how a vector field 
should look near a vaguely attractive point. 

Figures 5.5 and 5.6 allow one to see immediately that in the vicinity of a
vaguely attractive fixed point, there should usually appear two fixed points,
Fig. 5.5, or a periodic orbit, Fig. 5.6, depending on whether the stability loss
takes place, as $\alpha$ passes through $\alpha_c$, in one real direction or in two
complex-conjugate directions.
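
The two scenarios can be reproduced with a few lines of code. The sketch below
is a hedged illustration: the two model equations, $\dot x = \alpha x - x^3$
(one real direction) and $\dot z = (\alpha + i\mu) z - |z|^2 z$ (one complex
direction), are standard textbook examples chosen here as assumptions, and the
values of `ALPHA` and `MU` are arbitrary. It integrates both with a classical
Runge-Kutta scheme and prints where the motions end up.

```python
import math

ALPHA = 0.25   # hypothetical parameter value, slightly above alpha_c = 0
MU = 1.0       # hypothetical rotation frequency for the complex case

def pitchfork(x):
    """Model for a stability loss in one real direction (an assumption)."""
    return ALPHA * x - x ** 3

def hopf(z):
    """Model for a stability loss in one complex direction (an assumption)."""
    return (ALPHA + 1j * MU) * z - abs(z) ** 2 * z

def rk4_flow(f, s, t_end, h=0.01):
    """Integrate s' = f(s) up to time t_end; s may be a float or a complex."""
    for _ in range(int(round(t_end / h))):
        k1 = f(s)
        k2 = f(s + 0.5 * h * k1)
        k3 = f(s + 0.5 * h * k2)
        k4 = f(s + h * k3)
        s = s + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return s

if __name__ == "__main__":
    r = math.sqrt(ALPHA)
    print("expected fixed points: +/-", r)
    print("real case, from +0.01:", rk4_flow(pitchfork, 0.01, 60.0))
    print("real case, from -0.01:", rk4_flow(pitchfork, -0.01, 60.0))
    print("complex case, |z| from inside :", abs(rk4_flow(hopf, 0.01 + 0j, 60.0)))
    print("complex case, |z| from outside:", abs(rk4_flow(hopf, 1.0 + 0j, 60.0)))
```

In the real case the motions leave the origin and settle on one of the two
points $\pm\sqrt{\alpha}$; in the complex case the modulus $|z|$ tends to
$\sqrt{\alpha}$ both from inside and from outside, i.e., the motion approaches
a periodic orbit.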

In fact, this is the essential content of the Hadamard-Perron theorem and 
of the Hopf theorem which we will discuss in the upcoming sections. 

We conclude this section by returning to the problem that we have been 
using to motivate the analysis of this section: the stability of the stationary 
solution @ of Eq. (5.1.19).

10 Proposition. Consider Eq. (5.1.19). The stationary solution © is vaguely
attractive near ac = À + XY 03. 


Observation. The following proof shows that one should not blindly begin to 
compute mechanically the vague-attractivity constant Y,,. The reader will 
note the use of several “tricks” which are not worth being organized in a 
sequence of propositions refining the criterion of Proposition 8, but which, 
nevertheless, make the computation reasonably short. The reader should use 
these tricks in the exercises at the end of this section. 


PROOF. We apply Proposition 8, p.396. Changing variables to bring the fixed 
point to the origin, i.e., (w1, w2,w3)— (01,w2;,r), r = w3 — O3, Eq. (5.1.19) 
lef 


becomes, setting Az “= (Nw? + Mw? + NY r?), 


Wy 


= (a Qe)W D3w3 2X303 TW, — Wer — Aawa, 
We =0zw + (a — Ac)w2 — 2X703 rw + wir — Azwa, (5.5.48) 
t= — (ài + 3X703) akip O3 X — Dor. 


It is convenient to condense Eq. (5.5.48) by introducing 


z d + iwe, yE (a — ac) + i03, 
IE (A +3902), EE NYO — i, 
Q(r, z, Z) “ef A (Re z)? + AZ (Imz)? + Xs r? 
P(r, z,2) “! (X (Re 2)? + XY (Im 2)? + 3X7?) Bs 


Then, multiplying the second of Eqs. (5.5.48) by i and adding it to the first, 
we find that Eq. (5.5.48) becomes 


(5.5.49) 


£=(A—Er—Q)z, =r- P-rQ. (5.5.50) 


To put the above equation in normal form with respect to À, \ change variables 
(see Proposition 7) as: 


C=2-Azr, z= ——— (5.5.51) 


Then the first of Eqs. (5.5.49) becomes 


€ =2(A— Er —Q)—Arz(A— Er- Q)- Az r- P-rQ) 


A z N rO) de (5.5.52) 
ê Er- Q) AAN P Q) det & RAO, 


whose linear and quadratic terms are AC and —(AAr + Er + AAr — AAr)¢ = 
—(Er+AAr)¢, respectively . So we choose A = — E /A and Eq. (5.5.48) acquires
normal form in the (C, r) variables with respect to the eigenvalues À, A. From
Eq. (5.5.52), it is then easy to compute Ya(0): if € = ge”, 


La SONOS A ae =j 
Va(0) = lim [Re >=] | = lim Re (A = Qo + AP) = A)o 
(5.5.53) 
if Qo, Po are Q, P with r = 0. Therefore, 
ON" B2(M. 29 I asd 
HO) Sea? Ooi) se) 
Ay + 3X4 03 
hy EY OR (5.5.54) 
ay 2 - 2 1 2 W3 
= (A3 COS 0+ Ne sin 9) BNR 
which yields 
AS + AY Ar + XY OF 
= A 5.5.55 
Na 2d + BNY 03 E 
and Proposition 10 follows from Proposition 8.


5.5.1 Exercises 


1. Study the vague attractivity of the fixed point $x = 0$ of $\dot x = \alpha x + f(x)$, where
$f \in C^\infty(R)$, $f(0) = 0$, $f'(0) = 0$. Show that the origin cannot be vaguely attractive
unless $f''(0) = 0$.


2. Show, by producing some examples, that if the number 7,,, of Proposition 8 vanishes, 
then the fixed point may or may not be vaguely attractive. (Hint: Find examples other than 
iż =(atipjzt2327,aE R, we Rac = 0,2 E€ C.) 


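3. Suppose that the origin is vaguely attractive for $\dot x = x f(x^2, \alpha)$ near $\alpha_c$. Show that the
equation


$\dot x = -\mu y + x f(x^2 + y^2, \alpha), \qquad \dot y = \mu x + y f(x^2 + y^2, \alpha),$


also has the origin as a vague attractor near $\alpha_c$, $\forall \mu \in R$.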


4. Let $a_1 \in C^\infty(R)$. Show that the origin is a vague attractor near $\alpha_c$ for
$\dot x = -x (x^2 - a_1(\alpha))$ if $a_1(\alpha_c) = 0$.

5. Given $a_1, a_2 \in C^\infty(R)$ show that the origin is vaguely attractive for
$\dot x = -x (x^2 - a_1(\alpha))(x^2 - a_2(\alpha))$ near $\alpha_c$ if $a_1(\alpha_c) = a_2(\alpha_c) = 0$.
(Hint: use Observation (2) to Definition 5, p.391.)


6. Compute the vague-attractivity indicator Y = Ją, for the origin, see Proposition 8, in 
the equation 


2 


t = -py -r (2? +y? — a (a)),  y=pr- ylz 
assuming u E€ R, aı(&c) = 0,a1 E C®(R). 


7. Same as Problem 6 for


$\dot x = -\mu y - x (x^2 + y^2 - a_1(\alpha))(x^2 + y^2 - a_2(\alpha)),$


$\dot y = \mu x - y (x^2 + y^2 - a_1(\alpha))(x^2 + y^2 - a_2(\alpha)),$
assuming $\mu \in R$, $a_1, a_2 \in C^\infty(R)$, $a_1(\alpha_c) = a_2(\alpha_c) = 0$, study the vague attractivity. (From
[35]).


8. Let z = xı + ixz2, A = a+ip,,a,u E R, and consider the differential equation z = 
dz — aazz — z?Z. Apply Proposition 8 to find the vague-attractivity indicator for the origin 
near Qc = 0. (Answer: ¥ = —1). What can be said about the vague attractivity of the origin 
when u = 0? (Warning: Note that the equation does not have normal form.) 


9. Under the assumptions of the first sentence of Proposition 8, only suppose that the 
equation xx = f(x,a) has the form 


tı =o0(a)x1 — (a)r + Sı (£1,%2, y, &) + Ni(x1, 2, y, a) 


(a)xı — o(a)x2 + S2 (£1, 22, y, a) + N2 (£1, £2, y, a) 


t2 =u 
ý =L(a)y + F(21,£2,y,a), 


where à1 (a) = A2 (a) = o(a) +i (a), F has second-order zero at the origin z1 = x2 = 0, y = 
0, for all œ € I, Ny and N2 have a third order zero at the origin for all a € I, and S1, S2 
are homogeneous second-order polynomials in z1, %2, y. All the functions are supposed to 
be of class C™ in their arguments (x1, £2, y,a) € RI x I. 

Suppose, furthermore, F(x, £2,0, a) = 0 and also S1 (x1, 22,0,ac) = S2(£1, £2, 0, &c) =0 
and define ya(@) by Eq. (5.5.24) with the present meaning of the symbols. Show that the 
origin is vaguely attractive near a point a. € I, where (ac) = 0, u(ac) Æ 0, if 7 < 0, in 
spite of the presence of the terms S1, S2. (Hint: Show that the above equation can be put 
into normal form with respect to 1,1 with a change of variables like 


2 d—2 d—2 
6; =x; +) Do Ajentnye + D> BjkhYhYk, k,j = 1,2, 
h=1k=1 h,k=1 


and this change of variables does not affect the value of ya (0) because it changes the third- 
order terms by a quantity vanishing as y — 0.) This extends Problem 8. 


10. Prove that the same conclusions of Proposition 8 hold, replacing the assumption that
$N_1$ and $N_2$ are of third order with the assumption that $x_1 N_1 + x_2 N_2$ has a fourth-order
zero at $x_1 = x_2 = 0$, $y = 0$, for all $\alpha \in I$. (Hint: Simply go through the proof of Proposition
8.)


11. Same as Problem 9, replacing the assumption that $N_1$ and $N_2$ are of third order with
the assumption that $N_1$ and $N_2$ are of fourth order.


12. Show that the origin is not a vaguely attractive point near ae = 0 for all the values of 
E €C in the equation in R3: 


£= àz + Ezr— 22’, t = =r + zZ, 


where z = x + iy, à = a + iu, u £ 0, x,y, r E€ R3. 


13. Put into normal form the equation 


: : 2 
WY = -W1 — w2 + W1W2, w2 = wi — w2 + wi. 


(Hint: Introduce z = w1 + iwz and change variables as = z + Agz? + A12Z + Aoz?, etc.) 


14. Analyze the vague attractivity of the stationary solution w1 = w2 = 0, w3 = 5a of the 
equation
i P 1 1
w1 = —wW1 — W2W3, baa 2 + W1W3, og == eae 


15. Consider the equation (“Lorenz equation” ) 


t=o(-a+y), ùy =-—or—y-— grz, ż=-—bz+ry-a 


and study the vague attractivity of its fixed points for b = ø = 1. (Hint: For the analysis 
of the fixed point z = —$ at ac = (1 + )b try to use Eqs. (5.5.18) and (5.5.11), with 
Q = &c, j = 1, to put the first equation (in the appropriate variables) into the normal form 
of Eq. (5.5.26); the result will be y = —ø/b. Warning: The analysis of the other fixed points 


is very cumbersome.) 


16. Same as Problem 15 for b = š, o = 10. 


17. Suppose that the equation x = f(x, aœ) has a stationary solution x(a) for 

leae, depending continuously on a. Let L(a), A1(@),...,An(@) be the stability matrix and 
its eigenvalues. Assume that, for a < dc, it is ReA;(a) < 0 and, for a = ac Rerjy (ac) = 0 
for some jo. 

Show that if no eigenvalue actually vanishes at a = ac, (i.e., ZmA;(ac) #0, j = 1,..., 0), 
then the solution x(a) can be continuously continued to a > œe, (i.e., there is a continuous 
function a — x(a) defined in the vicinity of ac, and f(x(a)) = 0). 

Show also that if there is an eigenvalue vanishing at a = ac the solution x(q) will not admit, 
in general, a continuation for a > ac. (Hint: Just use the implicit functions theorem for the 
equation f(x, œ) = 0 near(x(ac), ac); then consider the example f(x) = az +z? +a, x(a) = 


(—a — (—4a + a)2)/2, ac = 0, L(a) = Ma) ~ -y a.) 


5.6 Vague-Attractivity Properties. The Attractive 
Manifold 


Every five years or so, if not more often, someone discovers 
the theorem of Hadamard and Perron, proving it by 
Hadamard’s method or by Perron’s. (Anosov) 


The solution © of Eq. (5.1.19), thought of as a family of differential equa- 
tions parameterized by a parameter a, is, as shown in Proposition 10, p.405, 
§5.5, vaguely attractive near a, = Ay + XY 63. 

Therefore, the motion t — sf) (w),¢ > 0, with initial datum close to @ 
continues to remain quite close to @ if a is near @- in spite of the instability of 
© for a > ae. We shall see that @ is not only unstable, but it also cannot be an 
attractor. Hence, the motions which develop from a datum in the vicinity of 
©, although remaining there, cannot generally have an asymptotic behavior, 
as t > +00, simply given by S;(w) > @. 

In the linear approximation, when the right-hand side of Eq. (5.1.19) is re- 
placed by the function w —> L,(w—@), where La is the stability matrix (5.5.1) 
of Eq. (5.1.19) at ©, the motion is very simple and w3 — @3 exponentially 
fast (~ e~ 1"), while w? + w2 grows exponentially (roughly as e(¢~%)*).

The linear approximation is certainly incorrect as soon as w? +w3 becomes
large, just because @ is vaguely attractive [see Observation (2) to Definition 
5, p.391]. However, one can hope that even in the essentially nonlinear mo- 
tion governed by Eq. (5.1.19), some memory remains of the fact that @ lost 
stability” only in two directions, i.e., only for what concerns the components 
of the motion in the plane w3 = ®3 generated by the eigenvectors v® , v?) 
of the stability matrix of ©. We can then think that the motion following
a given initial datum w close to @ develops essentially on a two-dimensional
surface, i.e., that the third component w3(t) asymptotically tends to become
a function y(w1(t), w2(t)) of the first two.

More generally, we can imagine ourselves in the following situation,
to which the upcoming Proposition 11 will refer.

Let $f(x,\alpha)$ be an $R^d$-valued function in $C^\infty(R^d \times R)$ such that the
differential equations


$\dot x = f(x,\alpha),$ (5.6.1)


parameterized by $\alpha$, have uniformly bounded trajectories as $\alpha$ varies in
$I = (\alpha_c - a, \alpha_c + a)$, $a \in (0,1)$, and have the point $x = x_0$ as a stationary solution,
$\forall \alpha \in I$,


$f(x_0,\alpha) = 0, \quad \forall \alpha \in I.$ (5.6.2)


Let $L_\alpha$ be the stability matrix in $x_0$ and suppose that for $\alpha < \alpha_c$, all its
eigenvalues $\lambda_1(\alpha),\dots,\lambda_d(\alpha)$ have a negative real part, while for
$\alpha \in I = (\alpha_c - a, \alpha_c + a)$, only $d - r$ eigenvalues have real parts less than or
equal to $-\nu_0 < 0$, the others having real parts larger than or equal to $-\nu > -\nu_0$
and vanishing for $\alpha = \alpha_c$ (i.e., $\mathrm{Re}\,\lambda_j(\alpha_c) = 0$ for $j = 1,\dots,r$).

For simplicity, also suppose that the eigenvalues of $L_\alpha$ are pairwise
distinct: so we can choose the eigenvalues $\lambda_1(\alpha),\dots,\lambda_d(\alpha)$ and the corresponding
eigenvectors $v^{(1)},\dots,v^{(d)}$ so that they are $C^\infty(I)$ functions of $\alpha$ and so that
every complex eigenvector appears together with a complex conjugate
eigenvector. We suppose that the eigenvectors and eigenvalues have been so chosen
and enumerated.

Under the above assumptions, we may assume without further loss of 
generality that for a € I, the equation takes the form of Eq. (5.5.10) and that 
the first r equations describe the evolution of the coordinates relative to the 
real plane generated by the “unstable” directions v™),...,v ) corresponding 
to the eigenvalues with large real part (> —1(). Were this not true, we could 
change the coordinates near xo (see Proposition 8, p.396) to make this true. 
Finally, suppose xp vaguely attractive for Eq. (5.6.1) near a, and denote 


U, r (8) (5.6.3) 


a system of neighborhoods associated to xg for a € JI, whose existence is 
guaranteed by Definition 5, p.390, of vague attractivity.

From the discussion of the preceding section, it appears that the situation
just described can be realized for Eq. (5.1.19), see Proposition 10, p.405,
§5.5, thought of as a family of differential equations parameterized by $\alpha$: thus
this provides a concrete example to which the following theory can be applied.

We now formulate a proposition giving a positive answer to the conjecture
hinted at above, that given $\alpha > \alpha_c$ close enough to $\alpha_c$ any motion of Eq. (5.6.1)
starting close enough to $x_0$ remains close to $x_0$ (since $x_0$ is vaguely attractive)
and, furthermore, it can be thought of as developing asymptotically, for
$t \to +\infty$, on an invariant surface $\sigma_\alpha$.

Such a surface will have dimension $r$ and it will be tangent to the
“instability's hyperplane,” $x_{r+1} = \dots = x_d = 0$; furthermore, it will be an attractor
for the motions starting in $U$ and its attraction strength will be exponential
and roughly measured, as in the linear case, by the parameter


The surface $\sigma_\alpha$ will generally be non unique since, as we shall see, it may
contain other smaller attractors: if $A_\alpha$ is a minimal attractor in $\sigma_\alpha$ which has
$U$ as its attraction basin, then, clearly, every invariant hypersurface $\sigma' \subset U$
containing $A_\alpha$ is an attractor for $U$; see the exercises at the end of this section.

Finally, the surface $\sigma_\alpha$ will be described inside the neighborhood
$\Gamma(x_0, \delta) = \{$cube centered in $x_0$ and side $2\delta\}$ by $d - r$ functions on $R^r \times R$ of
preassigned regularity $C^{(k)}$, $(k = 0, 1, \dots)$, via the equations


$x_{r+1} = \varphi^{(r+1)}(x_1,\dots,x_r,\alpha), \ \dots, \ x_d = \varphi^{(d)}(x_1,\dots,x_r,\alpha)$ (5.6.4)

(where we suppose $x_0 = 0$) provided $\delta$ is small and $\alpha$ is close to $\alpha_c$. For $\alpha$ close
to $\alpha_c$ this surface will be almost flat: if $k > 2$, this means that the first
derivatives of the functions in Eq. (5.6.4) vanish for $(x_1,\dots,x_r) = (x_{01},\dots,x_{0r})$.


The interest in the above considerations is that it will become possible to
analyze the asymptotic behavior of some properties of the motions originating
near a vaguely attractive point $x_0$, as $t \to +\infty$, reducing the $d$ equations of
Eq. (5.6.1) to the $r$ equations, labeled by $j = 1, 2, \dots, r$,


$\dot x_j = f_j(x_1,\dots,x_r,\varphi^{(r+1)}(x_1,\dots,x_r,\alpha),\dots,\varphi^{(d)}(x_1,\dots,x_r,\alpha),\alpha).$ (5.6.5)
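
The reduction can be tried out on a toy problem with $d = 2$, $r = 1$. The
sketch below is an illustration under stated assumptions, not an example from
the text: the system $\dot x = \alpha x - xz$, $\dot z = -z + x^2$ is
hypothetical, and the invariant curve is replaced by its leading-order
approximation $z \simeq x^2/(1 + 2\alpha)$, obtained by inserting
$z = c\,x^2$ in the equations and matching the $x^2$ terms. The code compares
the asymptotic state of the full system with that of the reduced
one-dimensional equation.

```python
import math

ALPHA = 0.05  # hypothetical parameter value, slightly above alpha_c = 0

def full(s):
    """Full two-dimensional system (a hypothetical example)."""
    x, z = s
    return (ALPHA * x - x * z, -z + x * x)

def reduced(s):
    """Reduced equation: z replaced by the leading-order curve x^2/(1+2*ALPHA)."""
    (x,) = s
    phi = x * x / (1.0 + 2.0 * ALPHA)
    return (ALPHA * x - x * phi,)

def rk4_flow(f, s, t_end, h=0.02):
    """Integrate s' = f(s) up to time t_end with classical RK4 (s is a tuple)."""
    for _ in range(int(round(t_end / h))):
        k1 = f(s)
        k2 = f(tuple(a + 0.5 * h * b for a, b in zip(s, k1)))
        k3 = f(tuple(a + 0.5 * h * b for a, b in zip(s, k2)))
        k4 = f(tuple(a + h * b for a, b in zip(s, k3)))
        s = tuple(a + h * (p + 2 * q + 2 * r + w) / 6.0
                  for a, p, q, r, w in zip(s, k1, k2, k3, k4))
    return s

if __name__ == "__main__":
    x_full, z_full = rk4_flow(full, (0.05, 0.3), 300.0)
    (x_red,) = rk4_flow(reduced, (0.05,), 300.0)
    print("full system   :", x_full, z_full)
    print("reduced system:", x_red)
```

The full system settles on the fixed point $(\sqrt{\alpha}, \alpha)$, while the
reduced equation settles on $\sqrt{\alpha(1 + 2\alpha)}$: the two abscissae
agree up to corrections of relative order $\alpha$, which is what one expects
when only the leading term of the invariant curve is kept.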

When d is large and r is small, this may be a very important simplification. 
When r = 1 or r = 2, this will say that the motion near the vaguely 
attractive point is a “one-dimensional” or “two-dimensional” problem. Figures 
5.5-5.6 already suggest that in such cases it will be possible to obtain deeper 
insights into the theory of the asymptotic behavior of the solutions of the 
equations starting with initial data close to x9. They even suggest the results 

of such a theory (see Figs. 5.5 and 5.6 and §5.7).

In the case of Eq. (5.1.19), it is r = 2 and, therefore, the three equations
(5.1.19) can be reduced, for a — a, small and for the purposes of the analysis 
of some asymptotic properties, to the first two equations with w3 replaced by 


w3 = p(w1,w2,a) + Ws (5.6.6) 


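where $\varphi$ is a suitable $C^{(k)}$ function (with a preassigned $k$) having a second-order
zero in $\omega_1 = \omega_2 = 0$, i.e., such that there exist $\psi_1, \psi_2, \psi_3 \in C^{(k-2)}$ and


$\varphi(\omega_1,\omega_2,\alpha) = \omega_1^2 \psi_1(\omega_1,\omega_2,\alpha) + \omega_2^2 \psi_2(\omega_1,\omega_2,\alpha) + \omega_1\omega_2 \psi_3(\omega_1,\omega_2,\alpha)$ (5.6.7)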


expressing the tangency of the surfaces ca to the instability plane in ©, (w3 = 
@3 in this case) provided k > 2. For k < 2 the near flatness can be expressed 
by Eq.(5.6.11) below (implying Eq.(5.6.7) for k > 2). 

A simple consequence of this, as we shall see in §5.7 (see footnote 15 
on p.431), will be that for a close to ac, @ > Qe, there is a periodic orbit 
which is a normal and minimal attractor lying on og with attraction basin 
U/C(@), where C(@) is a one-dimensional curve of points w through ©, whose 


asymptotic behavior is Sk (w) q7 &. Hence, it will be possible to draw a 
rather complete picture of the motion near @. 


A precise statement about the above matters is as follows. 


11 Proposition. Under the assumptions described in the above text between 
Eqs. (5.6.1) and (5.6.3), consider the symbols introduced there and let, for 
notational simplicity, xp = 0. 

Given k > 0,C > 0, there exist positive constants a+, ô, ðo, V € (0,1) with 
do < ôa} <aandd-—r functions of class C™ , denoted p**) ...,p , of 


ther + 1 variables 11,...,2,,a defined for 
ô 
|x| < zy Í= 1,...,7, QE (Qe — a4, Qc +a) (5.6.8) 


such that the surfaces oa C RÌ described by Eqs. (5.6.4) have for alla € I 
the properties: 
(i) “local invariance “: 


S (aN P(60)) Coa, Vt > 0; (5.6.9) 
(it) “local attractivity”: there exist C" > 0 such that for all w € U it is 

d(S\(w),ca)<C'ev#, = WED (5.6.10) 
(iii) “tangency” and “flatness”: Vj =1,...,7, 


$|\varphi^{(j)}(x_1,\dots,x_r)| < C (x_1^2 + \dots + x_r^2).$ (5.6.11)

Observations.

(1) The reader may be surprised by the fact that, for the first time in this
book, an important property is appearing and being considered in class $C^{(k)}$
rather than in class $C^\infty$: the reason is that in Proposition 11
one cannot choose $k = +\infty$. In fact using the methods of Problems 3 and 4,
p.428, the reader will check that in the equation $\dot x = \alpha x - x^3$, $\dot z = -z + x^2$
the surface og cannot be of class CCF) for k > x. 

(2) The above proposition is an important part of the “Hadamard and Per- 
ron theorem”. It is sometimes called the “invariant” or “attractive manifold” 
theorem and it has importance in the development of the qualitative theory 
of differential equations. It has been intensely studied, undergoing many ex- 
tensions and generalizations, often trivial but sometimes significant, [22]. 

(3) The family of surfaces $\sigma_\alpha$ is generally far from being uniquely determined
by Eq. (5.6.1) (see the exercises for §5.6).

(4) The length of the proof and its formulae look quite discouraging.
Actually the proof that follows is quite diluted and detailed (to conform to
the spirit of this book). The subsections 5.6.A-5.6.D below have only a
notational and definitional character. The first technical step is in subsection
5.6.E with an application of the implicit function theorem with the purpose
of stressing some properties of the surfaces $\sigma(\pi_t)$ approximating, as $t \to +\infty$,
the surfaces that we are looking for. Subsection 5.6.F collects all the
preceding inequalities to obtain further properties of the approximating surfaces
$\sigma(\pi_t)$ for “very small” $t$. Furthermore, it contains the two basic ideas of the
proof: (i) the estimates for very short times are possible because the quantity
$\nu_0 = \min_{j=r+1,\dots,d} (-\mathrm{Re}\,\lambda_j) > 0$, measuring the attractivity of the stable
directions, is much larger than all the other relevant quantities (i.e., for short times,
the “strong attractivity of the stable directions prevails over the weak
repulsivity of the unstable ones”); and (ii) the long-time estimates, as $t \to +\infty$, can
be obtained from the ones for short times taking advantage of the autonomy
of the equation.

These two themes occur again in a more or less repetitive way in subsections 
5.6.G-5.6.N, all very similar to each other and which have been included here 
only for completeness. 

The formulae are quite long and they could certainly be simplified and writ- 
ten more compactly. However, they are obtained by applying the procedures 
suggested in the text and they are left in the form in which they are con- 
structed: in this way, the reader may easily recognize their various parts and 
their origin and this, perhaps, makes the proof more clear. 

The vague attractivity assumption is used at the beginning of the proof to 
reduce it to an equivalent problem. 

The proof says much more than what is stated in Proposition 11 and some of 
its corollaries are described in the problems at the end of the section. 

The proof is adapted from that of Lanford, see [29].

PROOF. We shall discuss the proof of this proposition in the apparently
particular case when $d = 2$, $r = 1$, and the equation is

$\dot x = \alpha x + P(x,z), \qquad \dot z = -\nu_0 z + Q(x,z),$ (5.6.12)


where $\nu_0 > 0$ and $P, Q$ are two $C^\infty(R^2)$ functions with a second-order zero
at the origin:


$P(x,z) = x^2 P_1(x,z) + z^2 P_2(x,z) + xz P_3(x,z),$
$Q(x,z) = x^2 Q_1(x,z) + z^2 Q_2(x,z) + xz Q_3(x,z),$ (5.6.13)


and $P_i, Q_i$, $i = 1,2,3$, are in $C^\infty(R^2)$, see Appendix B.
The stability matrix of $0 \in R^2$ is


$L_\alpha = \begin{pmatrix} \alpha & 0 \\ 0 & -\nu_0 \end{pmatrix}$ (5.6.14)


and we suppose that $x_0 = 0$ is vaguely attractive near $\alpha_c = 0$.

This case looks quite special; however, its theory forces us to deal with all
the difficulties of the general problem whose analysis is a repetition of that
relative to Eq. (5.6.12). In the following formulae, it will essentially suffice to
think that $x$ and $z$ are vectors with $r$ and $d - r$ components and that $\alpha$, $\nu_0$
are matrices $r \times r$ or $(d-r) \times (d-r)$, respectively, possibly functions of the
parameter $\alpha$. The first will be a matrix with eigenvalues all having real part
not less than $-\nu > -\nu_0$, $\forall \alpha \in I = (\alpha_c - a, \alpha_c + a)$ and vanishing for $\alpha = \alpha_c$,
and the second with all the eigenvalues with real part not exceeding $-\nu_0 < 0$,
$\forall \alpha \in I$. Furthermore, $P, Q$ also will have to be thought of as depending
(smoothly) on $\alpha$.

Hence, consideration of Eq. (5.6.12) does not diminish the real difficulties
of the problem, and treating it spares the reader, in a first approach to a
proof which is complex although quite natural in its development, from
puzzling over fictitious (mainly notational) difficulties.

The interested reader will not have difficulties, on a second reading, in 
interpreting (mutatis mutandis) the proofs as relative to the general case (see 
exercises and problems at the end of this section for hints and suggestions). 

To make the analysis of the proof easier, it will be divided into various
basic steps distinguished by alphabetic characters.


5.6.1 A: Preliminary Considerations and an Equivalent Problem. 


Consider Eq. (5.6.12) and let U be the neighborhood introduced in Eq. (5.5.7), 
whose existence is guaranteed by the vague-attractivity assumption. 

Let $\Gamma(\rho) = \{$square in $R^2$ with side size $2\rho$, centered at the origin$\}$ =
$\{w \mid w \in R^2,\ |w_i| < \rho,\ i = 1,2\}$.

Choose k = 0, first. The case k > 0 will be discussed later. Fix C € (0,1).

Let 5,a+,to € (0,1) be small enough so that a+ < $a and the inequalities
in Eqs. (5.6.41), (5.6.42), (5.6.43), footnote 13 in p.418, (5.6.53), (5.6.61), 
(5.6.76), (5.6.83), and (5.6.84) that will be met in the following discussion are 
satisfied. It is not worth listing them explicitly a priori here. The only fact 
that we shall really need is that they can all be simultaneously satisfied by 
choosing 6, a+, to small enough, once C < 1 is given. 

Without loss of generality, we also suppose (see Definition 5, p.390, §5.5) 
that for alla € I = (a. — a, &c +a), 


rcu, suc rÈ), Yt> ts, 
‘ (5.6.15) 
ST (So) C LG) Vt>0 


for a suitable choice of ts > 0 and of ĝo < ô. 
Let ys be a C®(R?) function which takes values between 0 and 1, and 
has value 1 on I'($6) and value 0 outside T'(36). Let xs have the form 


xls, 2) =x(5, $), (5.6.16) 

where x € C®(R?) is 1 on I'($) and 0 outside T ($). 
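
A cutoff function of the kind used here can be written down explicitly. The
sketch below is only one possible construction under stated assumptions: it
uses the classical $C^\infty$ step built from $e^{-1/s}$, and the radii
(value 1 on the square of half-side 1/2, value 0 outside the square of
half-side 1) are hypothetical choices, since the construction in the text
fixes its own radii. The names `chi` and `chi_delta` are illustrative; the
rescaling `chi_delta(x, z, delta) = chi(x/delta, z/delta)` mirrors the form of
Eq. (5.6.16).

```python
import math

def _h(s):
    """exp(-1/s) for s > 0 and 0 otherwise: smooth, with all derivatives 0 at s = 0."""
    return math.exp(-1.0 / s) if s > 0.0 else 0.0

def smooth_step(s):
    """C-infinity step: 0 for s <= 0, 1 for s >= 1, strictly between in (0, 1)."""
    return _h(s) / (_h(s) + _h(1.0 - s))

def chi_1d(s, inner=0.5, outer=1.0):
    """One-dimensional cutoff: 1 for |s| <= inner, 0 for |s| >= outer (assumed radii)."""
    return smooth_step((outer - abs(s)) / (outer - inner))

def chi(x, z):
    """Cutoff on squares: 1 on the inner square, 0 outside the outer square."""
    return chi_1d(x) * chi_1d(z)

def chi_delta(x, z, delta):
    """Rescaled cutoff chi(x/delta, z/delta)."""
    return chi(x / delta, z / delta)

if __name__ == "__main__":
    for point in [(0.1, -0.2), (0.75, 0.0), (1.5, 0.0)]:
        print(point, chi_delta(point[0], point[1], 1.0))
```

Multiplying the nonlinear terms of an equation by such a function leaves the
equation unchanged near the origin and reduces it to its linear part far from
the origin, which is the only property of the cutoff used in the sequel.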

So every motion beginning in U enters T'(48), for 

and every motion beginning in (ĝo never leaves T 
of the vague-attractivity assumption. 

It will then suffice to prove Proposition 11 for the equations 


good, in a finite time t5, 


45). This is a consequence 


2 


$\dot x = \chi_\delta(x,z)\,(\alpha x + P(x,z)) \stackrel{def}{=} X_\delta(x,z,\alpha),$ (5.6.17)
$\dot z = -\nu_0 z + \chi_\delta(x,z)\,Q(x,z) \stackrel{def}{=} -\nu_0 z + Z_\delta(x,z,\alpha).$ (5.6.18)
It is useful to remark, for later use, that for the given values of a+, ô, ôo, C, k, 


and since xs vanishes outside ['(36), solutions t > Sk) (w) of Eqs. (5.6.17), 


(5.6.18) with initial datum $w \in \Gamma(\delta)$ remain in $\Gamma(\delta)$: just note that


$S_t^{(\alpha\delta)}(x,z) = (x, z e^{-\nu_0 t})$ (5.6.19)


as long as $\chi_\delta(x, z e^{-\nu_0 t}) = 0$.


5.6.2 B: Some Useful Estimates of Derivatives. 


Certain properties of solutions of Eqs. (5.6.17) and (5.6.18), thought of as an 
equation depending on the parameters a and 6 and with datum w € I(0) 
will be needed. The properties are summarized as follows. There exists a 
constant M > 1 and to € (0,1) such that, Vt € [—to, to], Va € (—a,a), Yô € 
(0,1), Yj = 1,2, 


asie) (wi, we)i 


— eHt§,,] < MIe|(\a| +ô), (5.6.20) 
Ow; 


(a6) 
28 Gera) < M(ôltlői + 8 It\5.2), (5.6.21) 
Oa 

asi?) (wi, we)i 
A 
where py = 0,2 = v, and where we have set (x,z) = w, w = (wi, wo), 
and have denoted the components of gf) (w) as Sk) (w) i = 1,2. Such 

notations will be often used in the following. 

The above inequalities follow from an analysis of the regularity theorem 
for differential equations, §2.4, and they will be left to the reader, except Eq. 
(5.6.20) which is proved, as an example, in Appendix L. 

We shall also need the following estimates, consequences of the definitions 
in Eqs. (5.6.17),(5.6.18). Let w € I'(0), |a| < a+,ô < 1, i = 1,2; then 


+ Hiwil < M |t| (Ja + 6) O41 + 66:2), (5.6.22) 


Emmi (lal + ô), Ao) < ma, (5.6.23) 
Q 

A] < M6, | Aste) | =0, and (5.6.24) 

IZs(w,a) < M |w, — [Xs(w,a)| < M (laljur|+|wl), (6.6.25) 


where M can and will be chosen the same as before, possibly increasing the 
latter. 


5.6.3 C: Definition of the Approximate Surfaces. 


Let $\pi \in C^\infty(R)$ be such that, $\forall x \in [-\delta,\delta]$, it is $|\pi(x)| < \delta$. Interpret it as
defining a surface (a curve in this case, actually) $\sigma(\pi) \subset \Gamma(\delta)$ of parametric
equations


$z = \pi(x), \qquad x \in [-\delta,\delta].$ (5.6.26)

Also suppose that

$\left|\frac{d\pi}{dx}(x)\right| < C\sqrt{\delta}, \qquad x \in [-\delta,\delta]$ (5.6.27)


(this choice of a bound on $\frac{d\pi}{dx}$ is quite arbitrary: $C\sqrt{\delta}$ could equally well be
replaced by $C\delta^\beta$, $0 < \beta < 1$).
Then by the invariance of $\Gamma(\delta)$, the set $S_t^{(\alpha\delta)}(\sigma(\pi))$, $t > 0$, is contained
in $\Gamma(\delta)$ and, as will be seen shortly, it is a surface of the form $\sigma(\pi_t)$, where $\pi_t$
is a new function verifying Eq. (5.6.27) and $|\pi_t| < \delta$.
It is then natural to try to define the surface that we are looking for as
the surface $\sigma(\pi_\infty)$, where


$\pi_\infty = \lim_{t \to +\infty} \pi_t$ (5.6.28)


if this limit exists. In this case, in fact, the relation $S_t^{(\alpha\delta)}(\sigma(\pi_\infty)) = \sigma(\pi_\infty)$
will be formally true.

5.6.4 D: Proof that the Approximate Surfaces are Well Defined. 
First we look for an expression for $\pi_t$. This function should be defined by


$(x, \pi_t(x)) = S_t^{(\alpha\delta)}(x_0, \pi(x_0)),$ (5.6.29)


where $x_0$ is a suitable point in $[-\delta, \delta]$ defined, naturally, by Eq. (5.6.29) which
should be thought of as an equation defining $\pi_t$ and $x_0$ in terms of $x$ and $\pi$.
Such an equation certainly has a solution since


$S_t^{(\alpha\delta)}(\pm\delta, \pi(\pm\delta)) = (\pm\delta, \pi(\pm\delta) e^{-\nu_0 t})$ (5.6.30)
and, therefore, by continuity, there exists a “function” $A(x,t,\alpha,\pi)$ such that
the abscissa of $S_t^{(\alpha\delta)}(A(x,t,\alpha,\pi), \pi(A(x,t,\alpha,\pi)))$ is just $x$, i.e.,


$x_0 = A(x,t,\alpha,\pi)$ (5.6.31)


is the solution of the equation obtained by equating the first components
of Eq. (5.6.29). Then $\pi_t(x)$ can be defined as the second coordinate of
$S_t^{(\alpha\delta)}(x_0, \pi(x_0))$ with $x_0$ given by Eq. (5.6.31).

By Eq. (5.6.19), one naturally sets $A(x,t,\alpha,\pi) = x$ for $|x| > \delta$.

It is not immediately clear from the above argument that the functions $A$
and $\pi_t$ are uniquely defined. To this question we devote the next step.
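
The construction of the transported curve can be imitated numerically. The
sketch below is a hedged illustration, not the procedure of the proof: it takes
a hypothetical model of the form (5.6.12), namely $\dot x = \alpha x - x^3$,
$\dot z = -z + x^2$ with $\nu_0 = 1$, and omits the cutoff. Because in this
model the first equation does not involve $z$, the abscissa $x_0$ that lands on
$x$ after a time $t$ can be found simply by integrating the first equation
backwards in time (in general one has to solve an implicit equation instead).
The function names `rk4`, `field`, `x_field` and `pi_t` are illustrative.

```python
ALPHA = 0.1  # hypothetical parameter value, slightly above alpha_c = 0

def rk4(f, s, t, h=0.01):
    """Integrate s' = f(s) over the signed time t with classical RK4 (s is a tuple)."""
    n = max(1, int(round(abs(t) / h)))
    dt = t / n
    for _ in range(n):
        k1 = f(s)
        k2 = f(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
        k3 = f(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
        k4 = f(tuple(a + dt * b for a, b in zip(s, k3)))
        s = tuple(a + dt * (p + 2 * q + 2 * r + w) / 6.0
                  for a, p, q, r, w in zip(s, k1, k2, k3, k4))
    return s

def field(s):
    """Hypothetical model: x' = ALPHA*x - x^3, z' = -z + x^2."""
    x, z = s
    return (ALPHA * x - x ** 3, -z + x * x)

def x_field(s):
    """First equation alone (it does not involve z in this model)."""
    (x,) = s
    return (ALPHA * x - x ** 3,)

def pi_t(x, t, pi0=lambda x0: 0.0):
    """Height above abscissa x of the curve z = pi0(x) transported for a time t."""
    (x0,) = rk4(x_field, (x,), -t)       # abscissa that reaches x after a time t
    _, z = rk4(field, (x0, pi0(x0)), t)  # second coordinate of the transported point
    return z

if __name__ == "__main__":
    for t in (5.0, 10.0, 20.0, 30.0):
        print(t, pi_t(0.1, t))
```

As $t$ grows the printed values stabilize, and the limit does not depend on the
initial curve `pi0`; for small $x$ the limit is close to $x^2/(1 + 2\alpha)$,
in agreement with the flatness property of the invariant curve.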


5.6.5 E: Alternative Proof of the Existence of $\pi_t$: Its Uniqueness
for $t$ Small and Estimates of Its Derivatives for $t$ Small.


As already noted, the argument in subsection 5.6.D does not prove uniqueness
of $\pi_t$, nor does it allow one to estimate its $x$ derivative when one tries to check
if it still verifies an inequality of the type of Eq. (5.6.27). In fact, it is a
superfluous argument introduced just to help the reader to visualize what is done below.

It is possible to prove constructively the existence and uniqueness of the
function $A$ and, at the same time, to obtain an estimate of the derivatives of
$A$ with respect to $x, t, \alpha$ by using the implicit function theorem. To study the
function $A$ in this way, write Eq. (5.6.29) as


t 
x= T0 +f X5(Sk (x0, m(ao)), adr, (5.6.32) 
0 


t 
m(x) = 7(20) +f A A 500?) (zo, T(x0)), adr, (5.6.33) 
0 


obtained from Eqs. (5.6.17) and (5.6.18), pretending that Xs and Zs are 
“known functions” of t and thinking of them as linear equations. We write 
Eq. (5.6.32) in the form G(x, £o, œ, t) = 0, where 


t 
G,(x,29,a,t) = £o +f xX (5 (zo, 7(Xo0)), adr, (5.6.34) 
0
is a function in C®(R4) which will be mainly considered for |x| < ô, |a| <
2a+, |xo| < ô, |¢| < to. 

We regard Gr (2, £o, a,t) = 0 as an equation for zo parameterized by x, a, t 
at fixed a. 

Since the point (%,%,@,0) is a solution point of our equation, VZ € 
[—6d, ô], V@, |@| < 2a}, we apply the implicit function theorem, see Appendix 
G, Eq. (G10), to find a square neighborhood with side 0(%,@) of (%,@,0) in 
R? such that if 


|x 7s z|, |a J al, lel < o(T, a) (5.6.35) 


then Gr(z, £o, Q, t) = 0 has a solution zo € [—ô, ô]. 
To prove the existence of o(%,@), we must study the derivative 


ues (x, z0, @, t). (5.6.36) 
Oxo 


From Eq. (5.6.34), using Eqs. (5.6.20), (5.6.23), (5.6.26), (5.6.27), and also 
recalling that C < 1, 6 < 1 (so that C6 < 1), one finds 


OG IX5(Sk) (ao, (x0), a) 
a en 1|= a ro ar (5.6.37) 


< 8M (a| + 8)(1 + M|t|(lal + 4)) |t]. Furthermore 


OCF (e, o,a, t) =1, (5.6.38) 


and, setting (t) = E(a, ô, £o, a, t) = i sie l ® (ao, m(x9)), to simplify notations, 


reoeo] =| f (E EARO, PEL 


< |t|(la| + 6) d|¢|M? + 6?|t|? M? (lal + 6 + Md), and, finally, (5.6.39) 
[Pee 
at 


The above inequalities for the derivatives are valid for all |t| < 1,|z| < 
d, |vo| < 6, 6 < 1. Assume now a4, ô, to so small that V |a| < 2a4, |t| < to: 


(x,270,0,0) = |X5(€(a,6,29,0,7))| < 2Mô(ô + al). (5.6.40) 


OG, 1 

Bang (2 B01 Ot) — i < 10 M (2a + ô)lt] <5, (5.6.41) 
R 1 

AO" (x, 20,048) <2Môlt| < (5.6.42) 

0G, 1

<= (e, 20,0, 1)| < M6(6+2a,) <=. (5.6.43)
Here 4 and a are arbitrary small numbers, convenient for the upcoming


estimates. Then if |a| < a4, |x — T| < 36, and if ¢ = min(a4, to46) the o(Z,a) 
just considered can be taken (see Appendix G, Proposition 1) as 


OG 
Tana: g min Bae | &. 
o(Z, a) = 2 max (S| + [Sz] + =] 2 PED) BEY aaa (5.6.44) 


having used Eqs. (5.6.41)-(5.6.43) to get the right-hand side inequality and having considered the maxima and the minima with respect to the parameters $t, \alpha, x, x_0$ as they vary in $[-t_0, t_0]$, $[-2\alpha_+, 2\alpha_+]$, $[-\delta, \delta]$, $[-\delta, \delta]$.

This shows the existence of $\Delta$ as a function of $x, \alpha, t$ as they vary in¹³

$$|x| \le \delta, \quad |\alpha| \le \alpha_+, \quad |t| \le \zeta, \tag{5.6.45}$$

and shows, as well, the possibility of estimating the derivatives of $\Delta$ as follows (see Eqs. (5.6.41)-(5.6.43) right-hand sides and Appendix G, Proposition 1):


OA(zx, t, a, T) 2n lE 20,0) 
Sa i|- an lOM aa), (6.6.46) 
xo 
OA(x,t, a, 7) pa ERCA] 
ee - rera < 4M |tļô, (5.6.47) 
To 
Alz tar), | ĉEezoat) 
S = | - amaan SMF Ca +8) (5.6.48) 
0x0 


valid for $x, \alpha, t$ in the region of Eq. (5.6.45). It is important to stress that Eqs. (5.6.46)-(5.6.48) have been obtained independently of the choice of $\pi$ provided


13 Note that, if 2toM (a4 + 5)6 < min(a4, to, $) = ¢, for |x| > 26 the determination of A 
is trivial and A(z,t,a,7) = x. Then let 7 € [-36, 36), @ € [-a+;,a+4] and remark that 
by Eqs. (5.6.35) and (5. 4 44) it is pose to P uniquely the equation for A € [—¢, ¢] 
in the region |x — z| < $. la- a| < +: |t| < $ As @,@ vary in [-36, 35] x [-a+, a4], 
this parallelepipedal region covers at least a neighborhood V of [—26, 26] x [-a,,a4] x 
[-<, £]. 

By the uniqueness of $\Delta$ in each parallelepiped, the functions $\Delta$ thus defined coincide at the points which are common to several parallelepipeds. Furthermore, the functions $\Delta$ have a value equal to $x$ for $|x| > 2\delta$.

Hence, we have built a continuous piecewise-differentiable solution $\Delta$ of $G_\pi(x, \Delta, \alpha, t) = 0$ in the region of Eq. (5.6.45); and by construction, $\Delta$ is the unique solution, with the property $|\Delta - x| < \zeta$, in this region.

Actually, A must be C° in the region of Eq. (5.6.45), since in each of the parallelepipeds 
where A has been constructed, A has this property and we have uniqueness. 

Finally $\Delta$ is the only solution with $|\Delta| \le \delta$ because, as noted above, any such solution must verify $|\Delta - x| < \zeta \le \frac12\delta$ and for $|x| > 2\delta$ it is $\Delta = x$.

$$|\pi(x)| \le \delta, \quad\text{and}\quad \Big|\frac{\partial\pi}{\partial x}(x)\Big| \le C\sqrt{\delta}, \qquad \forall\, x \in [-\delta, \delta]. \tag{5.6.49}$$

The above considerations show that the function $\pi_t$ is well defined at least for $|\alpha| \le \alpha_+$, $|t| \le t_+$, via Eqs. (5.6.29) and (5.6.31).

The uniqueness of the $\Delta$ function, coming from its construction (see footnote 13, p. 417) allows us to conclude that $\pi_t$ is the $S^{(\alpha,\delta)}_t$ image of $\sigma(\pi)$: $S^{(\alpha,\delta)}_t\sigma(\pi) = \sigma(\pi_t)$. Also note that, by the invariance of $\Gamma(\delta)$, one has $S^{(\alpha,\delta)}_t\sigma(\pi) \subset \Gamma(\delta)$.

The invariance of $\Gamma(\delta)$ for the motions generated by Eqs. (5.6.17) and (5.6.18) also implies that $\pi_t$ verifies the first of Eqs. (5.6.49) (a property already encountered during the construction of $\Delta$).


5.6.6 F: Check of the Validity of Eq. (5.6.49) for $\pi_t$, $0 \le t \le t_+$


This check is of fundamental importance since it will allow us to define $\pi_t$ for all $t \ge 0$.

The relation $S^{(\alpha,\delta)}_t\sigma(\pi) = \sigma(\pi_t)$, $t \in [0, t_+]$ will guarantee, taking also into account the group property $S^{(\alpha,\delta)}_t S^{(\alpha,\delta)}_{t'} = S^{(\alpha,\delta)}_{t+t'}$, that if $t \in [0, t_+]$, $t' \in [0, t_+]$, $t + t' \in [0, t_+]$ and if $\pi_t, \pi_{t'}, \pi_{t+t'}$ verify Eq. (5.6.49), then

$$(\pi_t)_{t'} = \pi_{t+t'}. \tag{5.6.50}$$

This relation will allow us to define uniquely $\pi_t$, $\forall\, t \ge 0$, by dividing the interval $[0, t]$ into intervals with amplitude $\tau \le t_+$ and, then, recursively setting

$$\pi_t = (\pi_{t-\tau})_\tau = ((\pi_{t-2\tau})_\tau)_\tau. \tag{5.6.51}$$

The definition will necessarily coincide with the one that could be given by setting $S^{(\alpha,\delta)}_t\sigma(\pi) = \sigma(\pi_t)$, $t \ge 0$.
Therefore let us verify that, if $0 \le t \le t_+$, $\pi_t$ fulfills the second of Eqs. (5.6.49) (as noted above, the first has already been checked).
For this purpose, we use Eq. (5.6.33), where instead of $x_0$, one should imagine $\Delta(x, t, \alpha, \pi)$. Differentiating both sides, one finds


Om, — Om1(%0) OA t 2 f Oia O84 ala,ð) 
Be ~~ Bay Bee Ep GG OPM orate) a) 


dr 


(5.6.52) 
with slightly symbolic differentiation notations (hopefully self-explanatory).
By Eqs. (5.6.41), (5.6.46), (5.6.49), and (5.6.20), (5.6.24), Eq. (5.6.52) implies, with some labor, that $\forall\, t \in [0, t_+]$, $\forall\,\alpha \in [-\alpha_+, \alpha_+]$, $\forall\, x \in [-\delta, \delta]$,


! foe , O82" (xo, m0): aaa OAL, T, 04,7) 
Oxo On (x0) Oxo Ox 
2z
Ox 

+ mòl + Mt(a, +6))+Mt(a,+5)CV54+ Mt (ay + 6)} 

= CV5(1+ Mt (a, +6))-(1+2— M (2a, + ô)t) 

n OVõle=o(1 +20 M t(2a, +6)) +tMC-1V5 (5.6.53) 


| < CV6e~"(1 + 20 M (2a, + 5)t) 


x [a+ Mt (a+ +6)) + Mta} +8)CV5+Mt(a, +ô) 


+OVO(L+Mt(a, + 6) (1+ 20M (2a, + ô)) 


x (1+ 20M (2a, +8))} < CV5(1- 25) 
if $\delta, \alpha_+, t_0$ (recall that $t_+ \le t_0$) are supposed to have been so chosen that the last inequality in Eq. (5.6.53) holds, $\forall\, t \in [0, t_+]$.¹⁴

The above arguments prove that $\pi_t$ can be defined by $S^{(\alpha,\delta)}_t\sigma(\pi) = \sigma(\pi_t)$ or, equivalently, by Eq. (5.6.51), for $t \ge 0$ and show that $\pi_t$ verifies Eq. (5.6.49) for all $t \ge 0$.


5.6.7 G: Proof of the Existence of the Limit as $n \to +\infty$ of $\pi_{nt}$ for $t \in [0, t_+]$.


We shall proceed by recursively evaluating

$$\|\pi_{nt} - \pi_{(n-1)t}\| = \max_{x \in [-\delta, \delta]} |\pi_{nt}(x) - \pi_{(n-1)t}(x)| \tag{5.6.54}$$

and show that the series

$$\sum_{n=1}^{\infty} \|\pi_{nt} - \pi_{(n-1)t}\| < +\infty \tag{5.6.55}$$

converges. This implies that $\pi_{nt}$ converges uniformly as $n \to +\infty$ to a limit.

To study the series of Eq. (5.6.55), consider two functions $\pi, \pi'$ verifying Eq. (5.6.49) and, through them, construct the functions $\Delta(x, t, \alpha, \pi)$ and $\Delta(x, t, \alpha, \pi')$ defined on the set given by Eq. (5.6.45), solving the equations for $x_0$: $G_\pi(x, x_0, \alpha, t) = 0$ and $G_{\pi'}(x, x_0, \alpha, t) = 0$ as indicated in subsection 5.6.E.

Abbreviating $\Delta(x, t, \alpha, \pi)$ and $\Delta(x, t, \alpha, \pi')$ as $x_0, x'_0$, respectively, and using Eq. (5.6.33), one then has $\forall\, t \in (0, t_+]$,

14 One sees that $C\sqrt{\delta}$ could be replaced by $C\delta^{\gamma}$, $\gamma < 1$. The choice $\gamma = 1$ could only be made if $\nu_0$ is large enough (or if we decided to allow $C > 1$ and $C$ to be large enough).

$$|\pi_t(x) - \pi'_t(x)| \le e^{-\nu_0 t}\,|\pi(x_0) - \pi'(x'_0)| + \int_0^t d\tau\, e^{-\nu_0 (t-\tau)}\, \big| Z_\delta(S^{(\alpha,\delta)}_\tau(x_0, \pi(x_0)), \alpha) - Z_\delta(S^{(\alpha,\delta)}_\tau(x'_0, \pi'(x'_0)), \alpha) \big| \tag{5.6.56}$$
which, by Eqs. (5.6.24), (5.6.49), and (5.6.20), implies 
me — mi(x)| < e-¥*(|n (x0) — n'(2))| + [a (wo) — 2! (2')1) 
t 2 
+ | M55 > |S) (0, n(20))i — SCP (2h, a! (ah) al dr 
0 i=l 
t 
<e~t(|In — m'|| + CVS |ax0 — 24) +f te (5.6.57) 
0 


2M6(1 + Mõla + 4)r)(|z0 — zo| + |7(xo) — 7'(z£0)|) 
< |r — m|| (e7 "| + 2M6t(1 + Måla, + 6)t)) 
+ |£o — zhI(CV 5e ™* + 2MSt(1 + Md(a, + 6)t)). 


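We must therefore estimate $|x_0 - x'_0|$. Remark that $(x_0, \pi(x_0))$ and $(x'_0, \pi'(x'_0))$ are the values of $S^{(\alpha,\delta)}_{-t}(x, \pi_t(x))$ and $S^{(\alpha,\delta)}_{-t}(x, \pi'_t(x))$; hence, as in Eq. (5.6.32),

$$x_0 = x - \int_0^t d\tau\, X_\delta(S^{(\alpha,\delta)}_{-\tau}(x, \pi_t(x)), \alpha), \qquad x'_0 = x - \int_0^t d\tau\, X_\delta(S^{(\alpha,\delta)}_{-\tau}(x, \pi'_t(x)), \alpha). \tag{5.6.58}$$

Then, by Eqs. (5.6.23) and (5.6.20),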


Ivo — z| < if dr |X5(S° (x, m(2)), a) — Xa (S? (w, w4(2)), a) 
< f iM EE E EE A (5.6.59) 


<tM(ay + 8)1(1 + M tlas +6))im(x) — mi(2)|, 
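Hence Eqs. (5.6.57) and (5.6.59) imply

$$\begin{aligned} |\pi_t(x) - \pi'_t(x)| \le{}& \|\pi - \pi'\|\,\big(e^{-\nu_0 t} + 2M\delta t\,(1 + M\delta(\alpha_+ + \delta)t)\big) \\ &+ \big(C\sqrt{\delta}\,e^{-\nu_0 t} + 2M\delta t\,(1 + M\delta(\alpha_+ + \delta)t)\big) \\ &\times \big(2M(\alpha_+ + \delta)\,t\,(1 + Mt(\alpha_+ + \delta))\big)\,|\pi_t(x) - \pi'_t(x)|. \end{aligned} \tag{5.6.60}$$

This formula implies a bound on $|\pi_t(x) - \pi'_t(x)|$ if $\alpha_+, \delta, t_0$ are so small that for all $t$, $0 \le t \le t_0$, the following inequality holds:

$$\frac{e^{-\nu_0 t} + 2M\delta t\,(1 + M\delta(\alpha_+ + \delta)t)}{1 - 2M(\alpha_+ + \delta)\,t\,(1 + Mt(\alpha_+ + \delta))\,\big(C\sqrt{\delta}\,e^{-\nu_0 t} + 2M\delta t\,(1 + M\delta(\alpha_+ + \delta)t)\big)} \le 1 - \frac{\nu_0 t}{2}. \tag{5.6.61}$$

Equations (5.6.61) and (5.6.60) imply $|\pi_t(x) - \pi'_t(x)| \le (1 - \frac12\nu_0 t)\,\|\pi - \pi'\|$; hence, $\forall\,\alpha \in [-\alpha_+, \alpha_+]$, $\forall\, t \in (0, t_+]$,

$$\|\pi_t - \pi'_t\| \le \Big(1 - \frac{\nu_0 t}{2}\Big)\,\|\pi - \pi'\|. \tag{5.6.62}$$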


A similar calculation would allow us to show that if $\pi$ verifies Eq. (5.6.49) and $\alpha_+, \delta, t_0$ are sufficiently small,

$$\|\pi_t - \pi_{t'}\| \le \gamma\,|t - t'| \tag{5.6.63}$$

for all $\alpha \in [-\alpha_+, \alpha_+]$, $\forall\, t, t' \in \mathbf{R}_+$, $|t - t'| \le t_+$, provided $\gamma$ is suitably chosen.
We shall use this inequality without proof here (see Appendix M where a proof is discussed and an explicit expression for $\gamma$ is exhibited).
Equation (5.6.62) allows us to estimate recursively Eq. (5.6.54) since it holds under the sole assumption that $\pi$ and $\pi'$ verify Eq. (5.6.49) and $t \in [0, t_+]$, $\alpha \in [-\alpha_+, \alpha_+]$. By subsection 5.6.F, one finds

$$\|\pi_{nt} - \pi_{(n-1)t}\| \le \Big(1 - \frac{\nu_0 t}{2}\Big)^{n-1} \|\pi_t - \pi\| \le 2\delta\,\Big(1 - \frac{\nu_0 t}{2}\Big)^{n-1} \tag{5.6.64}$$

valid for all $\pi$ verifying Eq. (5.6.49), $\forall\, n$ integer and $\ge 1$.

Hence, the series of Eq. (5.6.55) is uniformly convergent as $\pi$ varies in the class of the functions verifying Eq. (5.6.49), $\forall\, t \in [0, t_+]$, $\forall\,\alpha \in [-\alpha_+, \alpha_+]$.
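
As a worked check of why the bound of Eq. (5.6.64) yields the convergence claimed for the series of Eq. (5.6.55), the geometric sum can be written out. This is only a sketch: it assumes $t \in (0, t_+]$ with $0 < \nu_0 t < 2$, so that the ratio $1 - \frac12\nu_0 t$ lies in $(0, 1)$, and it uses the bound $2\delta\,(1 - \frac12\nu_0 t)^{n-1}$ on the $n$-th term.

```latex
\sum_{n=1}^{\infty} \|\pi_{nt} - \pi_{(n-1)t}\|
  \le 2\delta \sum_{n=1}^{\infty} \Big(1 - \frac{\nu_0 t}{2}\Big)^{n-1}
  = \frac{2\delta}{1 - \big(1 - \frac{\nu_0 t}{2}\big)}
  = \frac{4\delta}{\nu_0 t} < +\infty .
```

The right-hand side does not depend on the initial function $\pi$, which is the uniformity used in the next subsection.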


5.6.8 H: Independence of the Limit as $n \to +\infty$ of $\pi_{nt}$ from $\pi$ and $t \in [0, t_+]$


Denote $\pi_{\infty,t,\pi}$ the continuous function defined on $[-\delta, \delta]$, $\forall\, t \in [0, t_+]$, in terms of a $\pi$ verifying Eq. (5.6.49), by

$$\lim_{n \to +\infty} \pi_{nt} = \pi_{\infty,t,\pi}, \tag{5.6.65}$$

the continuity being insured by the uniformity of the limit of Eq. (5.6.65), see Eq. (5.6.55). The function $\pi_{\infty,t,\pi}$ is $\pi$ independent. In fact, Eq. (5.6.62) recursively implies

$$\|\pi_{nt} - \pi'_{nt}\| \le \Big(1 - \frac{\nu_0 t}{2}\Big)^{n} \|\pi - \pi'\| \xrightarrow[n \to \infty]{} 0 \tag{5.6.66}$$

if $\pi, \pi'$ verify Eq. (5.6.49). Hence it will be simply denoted as $\pi_{\infty,t}$. Now let $t, t' \in (0, t_+]$ and $t/t' = p/q$ = rational number, $p, q$ integers, so that $tq = t'p$; then

$$\pi_{ntq} = \pi_{nt'p}, \tag{5.6.67}$$

hence, in the limit $n \to +\infty$, Eq. (5.6.67) implies

$$\pi_{\infty,t} = \pi_{\infty,t'} \tag{5.6.68}$$

if $t/t'$ = {rational number}. Therefore Eq. (5.6.63) implies that Eq. (5.6.68) holds for all $t, t' \in (0, t_+]$ and $\pi_{\infty,t}$ is $t$ independent. Denoting $\pi_\infty$ the function in Eq. (5.6.68), it is $(\pi_\infty)_t = \pi_\infty$ and this proves the invariance of $\sigma(\pi_\infty)$; hence, Eq. (5.6.9).
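
The iteration $\pi \to \pi_t \to \pi_{2t} \to \dots$ of subsections 5.6.G and 5.6.H can be tried on a toy system where the flow is explicit. The sketch below is an illustration only, not the construction used in the proof: it assumes the system $\dot x = \alpha x$, $\dot z = -\nu z + x^2$ (the one of Problem 5 in the exercises) with hypothetical values $\alpha = 0.1 \ge 0$, $\nu = 1$, for which no cutoff function is needed, and the names `graph_transform`, `iterate` and `invariant_curve` are introduced here for the example.

```python
import math

ALPHA = 0.1  # hypothetical value of the parameter alpha (taken >= 0 so no cutoff is needed)
NU = 1.0     # hypothetical contraction rate nu


def graph_transform(pi, t, alpha=ALPHA, nu=NU):
    """Return pi_t, whose graph is the time-t image of the graph z = pi(x)."""
    k = nu + 2.0 * alpha

    def pi_t(x):
        x0 = x * math.exp(-alpha * t)  # abscissa at time 0 of the point reaching x at time t
        forced = x * x * (1.0 - math.exp(-k * t)) / k  # contribution of the x^2 term
        return math.exp(-nu * t) * pi(x0) + forced

    return pi_t


def iterate(pi, t, n, alpha=ALPHA, nu=NU):
    """Return pi_{nt}, obtained by applying the time-t graph transform n times."""
    for _ in range(n):
        pi = graph_transform(pi, t, alpha, nu)
    return pi


def invariant_curve(x, alpha=ALPHA, nu=NU):
    """Closed-form limit z = x^2 / (nu + 2 alpha) for this toy system."""
    return x * x / (nu + 2.0 * alpha)
```

Starting from different initial functions and different time steps, the iterates approach the same curve, which mirrors the independence of the limit from $\pi$ and from $t$ discussed above.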


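5.6.9 I: Attractivity of $\sigma(\pi_\infty)$.


Given $(\bar x, \bar z) \in \Gamma(\delta)$, let $\pi$ be a function verifying Eq. (5.6.49) and $\pi(\bar x) = \bar z$, e.g., $\pi(x) = \bar z$, $x \in [-\delta, \delta]$. Given $t \ge 2t_+$, let $t' \in (0, t_+)$ be such that $t' > \frac12 t_+$ and, furthermore, $t/t' = N$ = integer. Then, by Eq. (5.6.66) or Eq. (5.6.62),

$$\|\pi_t - \pi_\infty\| = \|\pi_t - (\pi_\infty)_t\| = \|(\pi)_{Nt'} - (\pi_\infty)_{Nt'}\| \le \Big(1 - \frac{\nu_0 t'}{2}\Big)^{N} \|\pi - \pi_\infty\| \le 2\delta\, e^{-\nu_0 t/2}, \tag{5.6.69}$$

which proves that $\sigma(\pi_t)$, hence $S^{(\alpha,\delta)}_t(\bar x, \bar z)$ as well, approaches $\sigma(\pi_\infty)$ with exponential strength so that the attractivity of $\sigma(\pi_\infty)$ is proved in the case of the Eqs. (5.6.17) and (5.6.18) and this immediately leads to Eq. (5.6.10).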
5.6.10 L: Order of Tangency. 
Let us show that if m is chosen so that it verifies Eq. (5.6.49) as well as 
|n(x)| <Cla|?, Vx e [-6,6], (5.6.70) 
then it is also true that 
\m(x)| < Cle|?, Y xe [=s ð], VtE Ry. (5.6.71) 
Hence, for x € |—6, ô], [ro (£)| < C|x|?, implying Eq. (5.6.11) for k = 0. 
Suppose that m verifies Eqs. (5.6.49) and (5.6.70), e.g., t(x) = 20|2|3. 
From Eqs. (5.6.33), (5.6.25),(5.6.20) and S{®® (0,0) = (0,0), it follows 
t 
|ne(x)| < e~”*C| xo]? +f M |S&°) (x0, t(a0))|?dr 
0 


< e7’'C|ao|? + Mt(1 + 2Mt(a, + 5))? (x2 + m(20)?) (5.6.72) 
< Clao|? (e7 tC + Mt(1 + 2Mt(ay. + 8))? (C7! Jao] + Clzol?)) 
for $t \in [0, t_+]$, $\alpha \in [-\alpha_+, \alpha_+]$.
From Eqs. (5.6.58), (5.6.25), (5.6.20) and $S^{(\alpha,\delta)}_t(0, 0) = (0, 0)$


t 
|zo| <|2| + M | (a4 [SS (x, me(x))1| + |S (x, me(a)) 7) dr 
3 (5.6.73) 


<le] + Mt{a+[(1 + Mt(ay + 8)) (l| + |m(2))I] 
+ [2(1 + 2Mt(a4 + 8))? (l|? + |me(x)|?)]} 


< |z| (1 + Mt{as[(1 + Mt(ay + 6))] + [21 + 2Mt(a, + 6))9]}) 


a Ime(x)|Mt{ a, ((1 + Mt(a, + 6))] + [2(1 + 2Mt(a, + 6))Pal}. 
To simplify the notations, rewrite Eqs. (5.6.72) and (5.6.73) by observing that if $\alpha_+, \delta, t_+ < 1$ (as supposed since the beginning of the analysis), there exists $M' > 0$ such that


\m4(x)| < Cle]? (1 — mi + M’V5t) (5.6.72') 


lxo] < |x| (1 + M’ (ay + d)t) + |re(a)| (1 + M"(ay + ô)t) (5.6.73’) 
Then, taking the 2 power of Eq. (5.6.72’) and using and (5.6.73’) 


reoi m + M'V5t) (1+ M'(aq + ô)t)(læ| + |me(a)|), (5.6.74) 


Im: (a) 
Since ô < 1, |m(x)| < ô, using |m;(x)| < |r:(x)? deduce from Eq. (5.6.74) 
2 


(1— 2t + M'VÕt)š (1+ M' (a4 + ô)t) 


ē <C? 2 
[m (x)|5 < |z| 1— M' (a4 + 6)t(1— 44 + M’V6t)3 


(5.6.75) 


Hence, let us choose $\delta, \alpha_+, t_0$ to be so small that, $\forall\, t \in [0, t_0]$, the ratio in Eq. (5.6.75) is bounded by


t 
ratio in Eq. (5.6.75) < 1— T (5.6.76) 


we see that Eqs. (5.6.75) and (5.6.76) imply, $\forall\, x \in [-\delta, \delta]$, $\forall\, t \in [0, t_+]$,


In(a)| < Clelia - 2) < lat, (5.6.77) 


hence, the inequality between the left-hand and the right-hand sides holds, $\forall\, t \ge 0$, and this implies Eq. (5.6.11) for $k = 0$.

5.6.11 M: Regularity in $\alpha$.


This is the last property to check. One proceeds almost exactly in the same way as above. Details will be illustrated because in some sense there is here a technical idea, new with respect to the ones already met. Actually we shall prove that $\pi_\infty$ is a Lipschitzian function of $\alpha$ and $x$ for $\alpha \in [-\alpha_+, \alpha_+]$, $x \in [-\delta, \delta]$, i.e., a somewhat stronger result.

Consider a function $(x, \alpha) \to \pi(x, \alpha)$ defined for $x \in [-\delta, \delta]$, $\alpha \in [-\alpha_+, \alpha_+]$, of class $C^{(1)}$ and verifying Eq. (5.6.49) for each $\alpha$. Define $\pi_t(x, \alpha)$ by thinking of $\pi$ as a function of $x$ for each $\alpha$ and proceeding as in subsection 5.6.E. From Eqs. (5.6.33), (5.6.24), (5.6.21), (5.6.20), (5.6.49) and employing the usual notations, one finds that


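$$\pi_t(x, \alpha) = e^{-\nu_0 t}\,\pi(x_0, \alpha) + \int_0^t e^{-\nu_0 (t-\tau)}\, Z_\delta(S^{(\alpha,\delta)}_\tau(x_0, \pi(x_0, \alpha)), \alpha)\, d\tau, \tag{5.6.78}$$

hence, recalling that $x_0$ is also $\alpha$ dependent and denoting $\partial_\alpha \equiv \frac{\partial}{\partial\alpha}$: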


my 
|e 


t 
[ðar (x, a)| < 7" Oqm (x0, a) + | +f a| 


Zs 
Ou; 


< e= ðan(zo, a)| +e" o VA 2 


< (86%) (ao, (xo, aE {8%(00, (20, @))i 


evolt-7) 


t 
+ amò | arf dr{ Mr(6+6°t) + (1+ Mr(az +6)) 
x (2 Oxo fay a) Oxo eae a) )} (5.6.79) 
Oxo Oa ða 
e°" Aq (ao, a)| + emote VE] + M?6?(1 + 6t)t? 


oro On (Xo, a) I) 


+ Oa 
| + M6?(1 + 6t)t? 
Oxo) 


4+2M5(1+ Mr(az +6))¢((1+ CV8)|5— 
e-t2M (1+ Mr(az +4))t fer aj 

+ {et OVS + 2M5(1 + Mtla, + 5)t(1 + cv 
The $\partial_\alpha x_0$ is estimated as in subsection 5.6.G, by Eq. (5.6.58) rewritten as

$$x_0 = x - \int_0^t X_\delta(S^{(\alpha,\delta)}_{-\tau}(x, \pi_t(x, \alpha)), \alpha)\, d\tau, \tag{5.6.80}$$

hence, proceeding as in the derivation of Eq. (5.6.79) and using Eq. (5.6.23):
2] 35 dr{ Mô + 2M (a+ + ô)
[möra +6r) + (1+ Mr(a, + aeee] 


2 
<M + 2M5- (1+)(a4 +ô) 
Ont (ax, a) 


+t(1 + Mt(a, + 6))2M (a+ + 6)| Aa |. 
Then Eqs. (5.6.79) and (5.6.81) imply 


A+ Blda7(z0, a)| 
|Oame(x, a)| < er (5.6.82) 
with 
GS je tOV + 2Mô(1 + Mt(ay +4))t(1 + CVO), 
S! tjl + Mt(a4d)2M (as + 6)], 


A“?! M?6(1 + St)t? + GIM + M26t?(1 + 6t) (a4 + ô), 


B% eot 4 2M5(1 + Mtla, +6))t 


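and to understand the essential features of Eq. (5.6.82), we note that if $\delta, \alpha_+, t_0$ are chosen so small that there is $\overline{M}$ such that the first term in Eq. (5.6.82) can be bounded by

$$\frac{A}{1 - GS} \le \overline{M}\,\delta^{3/2}\,t \tag{5.6.83}$$

for all $t \in [0, t_0]$, $\forall\,\alpha \in [-\alpha_+, \alpha_+]$, and the coefficient of $|\partial_\alpha \pi(x_0, \alpha)|$ in Eq. (5.6.82) can be bounded as

$$\frac{B}{1 - GS} \le 1 - \frac{\nu_0 t}{2}, \tag{5.6.84}$$

then Eq. (5.6.82) can be simply rewritten, $\forall\, t \in [0, t_+]$, $\forall\,\alpha \in [-\alpha_+, \alpha_+]$,

$$\|\partial_\alpha \pi_t\| \equiv \max_{|x| \le \delta,\ |\alpha| \le \alpha_+} |\partial_\alpha \pi_t(x, \alpha)| \le \overline{M}\,\delta^{3/2}\,t + \Big(1 - \frac{\nu_0 t}{2}\Big)\,\|\partial_\alpha \pi\|. \tag{5.6.85}$$

Now fix $\pi$ to be a function of the variable $x$ only and verifying Eq. (5.6.49). Apply Eq. (5.6.85) to the functions $\pi_{nt}, \pi_{(n-1)t}, \dots$ thought of as functions of $x$ and $\alpha$. If $t \in (0, t_+]$, $n = 0, 1, 2, \dots$,

$$\|\partial_\alpha \pi_{(n+1)t}\| \le \overline{M}\,\delta^{3/2}\,t + \Big(1 - \frac{\nu_0 t}{2}\Big)\,\|\partial_\alpha \pi_{nt}\|. \tag{5.6.86}$$

Then, Eq. (5.6.86) implies, recursively,

$$\|\partial_\alpha \pi_{nt}\| \le \overline{M}\,\delta^{3/2}\,t \sum_{j=0}^{n-1} \Big(1 - \frac{\nu_0 t}{2}\Big)^{j} \tag{5.6.87}$$

because $\pi_0 = \pi$ is by hypothesis $\alpha$ independent, so that $\partial_\alpha \pi_0 = 0$, i.e.,

$$\|\partial_\alpha \pi_{nt}\| \le \frac{\overline{M}\,\delta^{3/2}\,t}{\nu_0 t / 2} = \frac{2\overline{M}\,\delta^{3/2}}{\nu_0}. \tag{5.6.88}$$

The regularity of $\pi_\infty$ can now be checked:

$$|\pi_\infty(x, \alpha) - \pi_\infty(x', \alpha')| = \lim_{n \to \infty} |\pi_{nt}(x, \alpha) - \pi_{nt}(x', \alpha')| \le \lim_{n \to \infty} \big(|x - x'| + |\alpha - \alpha'|\big) \max \Big( \Big|\frac{\partial \pi_{nt}}{\partial x}\Big| + \Big|\frac{\partial \pi_{nt}}{\partial \alpha}\Big| \Big) \tag{5.6.89}$$

where the maximum is taken on the set $[-\delta, \delta] \times [-\alpha_+, \alpha_+]$ and, by Eq. (5.6.49) (considered for $\pi_{nt}$) and Eq. (5.6.88), it can be estimated by $D = \sqrt{\delta}\,(1 + 2\overline{M}\,\nu_0^{-1}\,\delta)$. Hence,

$$|\pi_\infty(x, \alpha) - \pi_\infty(x', \alpha')| \le D\,\big(|x - x'| + |\alpha - \alpha'|\big), \tag{5.6.90}$$

showing that $\pi_\infty$ is continuous in $x$ and $\alpha$ (i.e., it is in class $C^{(0)}$) and, actually, that it is a Lipschitz function in $x$ and $\alpha$ (with a Lipschitz constant $D$ which can be taken as small as desired by taking $\delta$ small enough).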
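
The step from the recursion of Eq. (5.6.86) to the uniform bound of Eq. (5.6.88) is an affine recursion with contraction factor $1 - \frac12\nu_0 t$. A minimal numerical sketch follows; the function names and the sample values of $\overline{M}$, $\delta$, $\nu_0$, $t$ are hypothetical and serve only to show that the iterates stay below $2\overline{M}\delta^{3/2}/\nu_0$.

```python
def derivative_bounds(m_bar, delta, nu0, t, n_steps):
    """Iterate a_{n+1} = c + q * a_n from a_0 = 0, with
    c = m_bar * delta**1.5 * t and q = 1 - nu0 * t / 2."""
    c = m_bar * delta ** 1.5 * t
    q = 1.0 - nu0 * t / 2.0
    if not 0.0 <= q < 1.0:
        raise ValueError("need 0 < nu0 * t <= 2 so that the factor q lies in [0, 1)")
    bounds = [0.0]  # a_0 = 0 because the initial function does not depend on alpha
    for _ in range(n_steps):
        bounds.append(c + q * bounds[-1])
    return bounds


def limiting_bound(m_bar, delta, nu0):
    """Fixed point c / (1 - q) of the recursion, independent of t."""
    return 2.0 * m_bar * delta ** 1.5 / nu0
```

The fixed point does not depend on the step $t$, which is why the Lipschitz constant $D$ above contains only $\overline{M}$, $\nu_0$ and $\delta$.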


5.6.12 N: General Case. 


To show that $\pi_\infty$ is $k$-times differentiable with respect to $x$ if $\alpha_+, \delta$ are chosen sufficiently small, one proceeds to estimate $\frac{\partial^2 \pi_t}{\partial x^2}$ and, successively, $\frac{\partial^3 \pi_t}{\partial x^3}, \dots, \frac{\partial^{k+1} \pi_t}{\partial x^{k+1}}$ in the same way as in the $k = 0$ case we studied $\pi_t$ and $\frac{\partial \pi_t}{\partial x}$ to show that $\pi_\infty$ was $C^{(0)}$, assuming now that $\pi$ is in $C^{(k+1)}([-\delta, \delta])$ to start with.

Proceeding with the same technique as in subsections 5.6.F, 5.6.G, and 5.6.L, $(1 + k)\delta$, $(1 + k)\alpha_+$, $t_0$ are chosen sufficiently small so that inequalities similar to Eqs. (5.6.41), (5.6.42), (5.6.43), and (5.6.53), etc. hold. One finds


m(x) Vot Ər 
= (EAN F a 
a a ASU 


LV > 
© 


t — .6.91 
IRO gD 6621) 


for h =0,...,k+1 and t € [0,t,] with ¢, suitably small provided y > Rg, s(y) 
is a suitable continuous function in the variables 6, y and monotonically in- 
creasing in y. 

Equation (5.6.91) has the same nature as Eq. (5.6.86), and in the same 
manner it allows us to show inductively that the Eq. (5.6.49) as well as 


|Z || < 0, 7 =0,...,4 +1, imply 
k+1


j=0 


Equation (5.6.92) means that $\pi_\infty$ is $k$-times continuously differentiable.

Along similar lines, it is possible to prove the $C^{(k)}$ regularity in the variable $\alpha$ and, jointly, in $\alpha$ and $x$ for $|\alpha|, |x|$ small. By way of estimates of the $(k + 1)$-th derivative of $\pi_t$ with respect to $\alpha$, of the $k$-th derivative of $\frac{\partial \pi_t}{\partial x}$ with respect to $\alpha$, ..., and of the first derivative with respect to $\alpha$ of $\frac{\partial^k \pi_t}{\partial x^k}$, this regularity property is proved following the ideas and the techniques of subsections 5.6.M and 5.6.N.

The reader who has been determined enough to reach this point shall not have problems in transforming the above last hints into a proof. We only stress that from what has been said above, it appears that in order to obtain $C^{(k)}$ regularity, one must impose restrictions on $\delta$, $\alpha_+$, and $t_0$ which are $k$ dependent. This means that the above proof cannot be used to prove that the attractive manifold depends in a $C^\infty$ way on $x$ and $\alpha$: actually, it is an open problem to find whether such a smoothness property can be enjoyed by the attractive manifolds under simple extra assumptions (whose necessity is made clear by the example in Observation (1), p.412).


5.6.13 Exercises 


1. Show vague attractivity of $O = (0, 0)$ near $\alpha_c = 0$ for $\dot x = \alpha x - x^3$, $\dot z = -z$, $(x, z) \in \mathbf{R}^2$.


2. In the context of Problem 1, show that the plane z = 0 is an attractive manifold in the 
sense of Proposition 11. 


3. Consider the equation in Problem 1 and the surface $\sigma_\alpha$ built with three pieces with respective parametric equations


z(y) =Ze77 i ee 
ny) = 79) = Vat Sentary-$ > 7EM, 
2(y) =2’e77 
es =7' (y) = -va (1 + Se 207) 2’ y € [0, +00) 
{ | L4? TE (0,400) 


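Show that $\sigma_\alpha$ is an attractive manifold $\forall\,\bar x, \bar x', \bar z, \bar z'$ such that $\sqrt{\alpha} < \bar x, -\bar x'$, $\alpha > 0$, in the sense of Proposition 11. (Hint: Note that $t \to \bar x(t)$ is a solution of $\dot x = \alpha x - x^3$ with initial datum $\bar x$.)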


4. Show that the attractive manifolds in Problem 3 are in $C^{(k)}$, at fixed $\alpha$, if $\alpha$ is small enough ($2\alpha k < 1$). Show that the equation in Problem 1 admits infinitely many attractive manifolds not $C^\infty$ in $x$. Meditate on how general this non uniqueness mechanism is.


5. Consider the equation $\dot x = \alpha x$, $\dot z = -\nu z + x^2$ and determine all the attractive manifolds of the origin for $0 < -\alpha < \nu$. Show that for each $\alpha < 0$ there are infinitely many such manifolds but only one, at most, can be of class $C^\infty$. Find a value of $\alpha < 0$ for which no attractive manifold is of class $C^{(2)}$. (Hint: Note that an attractive manifold must be a union of trajectories of solutions of the differential equation; see also Problem 1. The critical value of $\alpha$ is $\alpha = -\nu/2$.)


6. Using the example of Problem 5 show that the assumption Re A;(ac) = 0, j =1,...,7, 
is essential in Proposition 11. If this assumption is not verified argue that a proposition like 
Proposition 11 could still hold if the order k of smoothness is restricted as k < vo/%, at 
least. See also Problem 7. 


7. Prove Proposition 11 for Eq. (5.6.12) when $\alpha$ is near some $\alpha_c$, $-\nu_0 < \alpha_c < 0$ and $k = 0$. (Hint: Write Eq. (5.6.17) as $\dot x = \alpha_c x + \chi_\delta(x, z)\,((\alpha - \alpha_c)x + P(x, z))$ and proceed as in the proof in §5.6 with the obvious substitution of Eq. (5.6.32), and of the other equations similar to it, with $x = e^{\alpha_c t} x_0 + \int_0^t e^{\alpha_c (t - \tau)}\, X_\delta(S^{(\alpha,\delta)}_\tau(x_0, \pi(x_0)), \alpha)\, d\tau$, etc.)


8. Show the validity of Proposition 11 in the case in which Eq. (5.6.12) is replaced by the equation ($\mu > 0$)

$$\begin{aligned} \dot x_1 &= \alpha x_1 - \mu x_2 + P_1(x_1, x_2, z), \\ \dot x_2 &= \mu x_1 + \alpha x_2 + P_2(x_1, x_2, z), \\ \dot z &= -\nu_0 z + Q(x_1, x_2, z). \end{aligned}$$

(Hint: Write the equation analogous to Eq. (5.6.17) as

$$\begin{aligned} \dot x_1 &= -\mu x_2 + \chi_\delta(x_1, x_2, z)\,(\alpha x_1 + P_1(x_1, x_2, z)), \\ \dot x_2 &= \mu x_1 + \chi_\delta(x_1, x_2, z)\,(\alpha x_2 + P_2(x_1, x_2, z)), \\ \dot z &= -\nu_0 z + \chi_\delta(x_1, x_2, z)\, Q(x_1, x_2, z), \end{aligned}$$

with analogous notations. Then proceed exactly as in the proof in §5.6, substituting Eq. (5.6.32), and the other equations similar to it, with

$$x = W(t)\,x_0 + \int_0^t W(t - \tau)\, X_\delta(S^{(\alpha,\delta)}_\tau(x_0, \pi(x_0)), \alpha)\, d\tau,$$

where $W(t) = \begin{pmatrix} \cos \mu t & -\sin \mu t \\ \sin \mu t & \cos \mu t \end{pmatrix}$ is the Wronskian matrix; see, also, problems for §2.5, etc.)

9. Using the same ideas as in Problems 7 and 8, study Proposition 11 in the general case, i.e., for an equation of the form of Eq. (5.5.10).


10. If xo is not supposed to be vaguely attractive, recognize that the proof of Proposition 
11 can be interpreted as showing the existence of a surface oq defined as in Eq. (5.6.4), 
verifying Eq. (5.6.11) and 

(i) If w € I (0) Noa and sw E T(4ô), Vr € [0,t], then sw € a (“local invari- 
ance”). 

(ii?) If Sew € (5o) N ca, Yt > 0, then d(S{ w, oa) 15 
statements of hold as long as the point stays inside r(46). (Hint: Vague attractivity is used 


2 
only to reduce the proof to a theory of (5.6.17), (5.6.18). So just start from them.) 


0 exponentially fast, i.e. the 


11. Consider the equation $\dot x = f(x)$ and suppose that $x_0 = 0$ is a stationary solution for it. Let $L$ be the stability matrix of $x_0$ and suppose that $L$ has $(d - r)$ eigenvalues with negative real parts and $r$ with zero real parts. Without imposing the vague attractivity of 0, interpret the proof of Proposition 11 with $\alpha = 0$ as showing that given $k \ge 0$, $C > 0$, there exists $\delta$ and $\delta_0$, $\delta > \delta_0$, and a surface $\sigma$ of dimension $r$ and described by $(d - r)$ functions $(\varphi^{(1)}, \dots, \varphi^{(d-r)})$ of $r$ variables $x_1, \dots, x_r$, with $|x_i| < \frac12\delta$ and verifying Eq. (5.6.11) as well as the local invariance and attractivity properties of the preceding problem (“theorem of the central manifold”).


12. Consider the equation $\dot x = \lambda x + P(x, z)$, $\dot z = -\nu z + Q(x, z)$ with $\lambda, \nu > 0$ and $P, Q \in C^\infty(\mathbf{R}^2)$ with a second order zero at the origin. Show that

$$S_t(x, z) = \big(e^{\lambda t} x + t\,D(x, z, t),\ z\,e^{-\nu t} + t\,E(x, z, t)\big)$$

with $D$ and $E$ of class $C^\infty$ and having a zero of second order at $(0, 0)$ in the variables $(x, z)$.


13. Use Problem 12 to show that, in the same context and for all small ô, if m is a cH) 
function on [—6, ô] such that 
d 
moss EE] < v3 () 
dx 
and if a(r) denotes the curve z = m(x), x € [—6, ô], then S;o(z) is such that Syo(a)NI'(6) = 
o(mz). and m+, verifies Eq. (*). (Hint: Use the ideas of the proof of Proposition 11.) 


14. In the context of Problems 12 and 13, show that 


Won 4ayt — Tall < Elz — M(n—aell> Irz = ral < Ellr — a'll 


with € < 1 (if ||- || denotes the maximum of a function) provided 60 is small enough. 
Deduce the consequent existence in I°(4) of a surface locally invariant for S; and tangent 


to the x axis at the origin and such that Siw += 0 exponentially fast in the sense 


—XA = limt—-+oo + log |S_+tw| for all nonzero w on the surface. Denote this surface by oj: it 
is called the “unstable manifold” through 0. 


15. In the context of Problem 12, show the existence in $\Gamma(\delta)$ of a surface $\sigma_s$ locally invariant for $S_t$, tangent to the $z$ axis, and such that $\forall\, w \ne 0$, $w \in \sigma_s$ it is $-\nu = \lim_{t \to +\infty} \frac1t \log |S_t w|$ (“stable manifold through 0”).


16. Study the generalization of the result of Problems 10-13 to a general equation in $\mathbf{R}^d$, $\dot x = f(x)$, with $f(0) = 0$ and a stability matrix $L$ whose eigenvalues are pairwise distinct and such that none among them has a zero real part, although some of them have a positive real part and others have a negative real part (“hyperbolic unstable point”) (“existence of stable and unstable manifolds at a hyperbolic fixed point”).


17. Consider the equation « = x 4 = H Bes zt z H z2 a=6=1,8 7 = 2, 


and compute the second derivative at the origin of the function 7 , defining (via z = 75(z)) 
the stable manifold of 0. (Hint: Write x = Az? + Bz? +... and insert this expression in the 
first equation. One finds A = 871.) 


18. Find some extensions to Problems 14 and 15 to equations in R and study them. 


19. In the context of Proposition 11, show that if $\sigma_\alpha$ is regarded as an attractor for the neighborhood $U$ used in the proof [see Eq. (5.6.15)], and if $\tilde\sigma_\alpha = \bigcap_{t \ge 0} S_t \sigma_\alpha$ then the function in the left-hand side of Eq. (5.3.21), p.380, with $A = \tilde\sigma_\alpha$ can be estimated by an exponentially decreasing function of $t$ as $t \to +\infty$, i.e., $\tilde\sigma_\alpha$ is a normal attractor for $U$ by Proposition 5, §5.3, p.379. (Hint: Examine the text of Proposition 11 and the discussion around Eq. (5.6.15).)
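
As a partial numerical companion to Problem 5, one can check that a candidate family of curves is invariant under the explicit flow of $\dot x = \alpha x$, $\dot z = -\nu z + x^2$. The family $z = x^2/(\nu + 2\alpha) + c_\pm |x|^{\nu/|\alpha|}$ used below is an assumption of this sketch (it is what integrating $dz/dx$ along trajectories suggests for $\alpha < 0$, $\nu + 2\alpha \ne 0$), and the function names are hypothetical; the check does not replace the proof asked for in the problem.

```python
import math


def flow(x0, z0, t, alpha, nu):
    """Exact time-t flow of dx/dt = alpha*x, dz/dt = -nu*z + x**2 (needs nu + 2*alpha != 0)."""
    k = nu + 2.0 * alpha
    x = x0 * math.exp(alpha * t)
    z = z0 * math.exp(-nu * t) + x0 * x0 * (math.exp(2.0 * alpha * t) - math.exp(-nu * t)) / k
    return x, z


def candidate_curve(x, alpha, nu, c_left=0.0, c_right=0.0):
    """z = x^2/(nu + 2 alpha) + c |x|^(nu/|alpha|), with independent constants for x < 0 and x > 0."""
    c = c_right if x >= 0.0 else c_left
    return x * x / (nu + 2.0 * alpha) + c * abs(x) ** (nu / abs(alpha))
```

Because the two constants can be chosen independently on the two sides of the origin, the sketch also illustrates how infinitely many such curves arise; at $\alpha = -\nu/2$ the formula breaks down, consistently with the critical value quoted in the hint.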
5.7 An Application: Bifurcations of the Vaguely
Attractive Stationary Points into Periodic Orbits. The
Hopf Theorem


After the considerations of §5.5 Proposition 10, p.405, the theory of §5.3 can be immediately applied to Eq. (5.1.19). Fixed $k$, $k > 2$, there is $B > 0$ and a cubic neighborhood $\Gamma(\delta)$ centered at $\bar\omega$ with side $2\delta$, and a family $\sigma_\alpha$ of $C^{(k)}$ surfaces in $\Gamma(\delta)$ with equations

$$\omega_3 = \bar\omega_3 + \varphi_\alpha(\omega_1, \omega_2), \tag{5.7.1}$$

defined for $|\omega_1|, |\omega_2| < \frac12\delta$ and $\alpha$ close to $\alpha_c$, $\alpha \in (\alpha_c - \alpha_+, \alpha_c + \alpha_+)$, and

$$|\varphi_\alpha(\omega_1, \omega_2)| \le B\,(\omega_1^2 + \omega_2^2), \tag{5.7.2}$$

$\varphi_\alpha \in C^{(k)}([-\frac12\delta, \frac12\delta]^2 \times (\alpha_c - \alpha_+, \alpha_c + \alpha_+))$, and for every $\alpha$ close to $\alpha_c$ the surface $\sigma_\alpha$ is invariant in the sense of Eq. (5.6.9) and attractive for all the points of $\Gamma(\delta)$ in the sense of Eq. (5.6.10), with exponential strength.

It will be shown that if $(\alpha - \alpha_c) > 0$ is sufficiently small, there is on $\sigma_\alpha$ a minimal attractor $A_\alpha$ consisting of a periodic orbit with a period approximately $2\pi/\bar\omega_3$ and attracting the points on $\sigma_\alpha \setminus \{\bar\omega\}$ with exponential strength.

Essentially, by using Proposition 5, §5.3, it will then follow that, in the situation of the preceding sentence, $A_\alpha \cup \{\bar\omega\}$ is an attractor for which the basin $\Gamma(\delta)$ is normal and $\forall\, w \in \Gamma(\delta)$, $\exists\, a(w) \in A_\alpha \cup \{\bar\omega\}$ such that

$$|S_t(w) - S_t(a(w))| \xrightarrow[t \to +\infty]{} 0 \tag{5.7.3}$$

exponentially fast. This statement “completes” the analysis of the asymptotic behavior of the motions of Eq. (5.1.19) with initial datum $w$ close enough to $\bar\omega$ and with a given $\alpha$ slightly above $\alpha_c$.¹⁵

To see which is the real motion of the gyroscope corresponding to this 
asymptotically periodic motion of its angular velocity, it would still be neces- 
sary to integrate the “geometric” differential equations connecting the Euler 
angles with the angular velocity, see Eqs. (5.2.9)-(5.2.11). We shall not discuss 
this last point. 

The preceding statements follow, as a special case, from the following 
general “Hopf bifurcation theorem” and from the observations to it. 


12 Proposition. Consider a differential equation $\dot x = f(x, \alpha)$ in $\mathbf{R}^2$, parameterized by $\alpha \in (-a, a)$ and having the origin $O$ as a vaguely attractive stationary solution near $\alpha_c = 0$. Suppose that the stability matrix of the origin, denoted $L(\alpha)$, has eigenvalues $\lambda(\alpha) = \alpha + i\mu(\alpha)$, $\bar\lambda(\alpha) = \alpha - i\mu(\alpha)$, $\mu(0) \ne 0$. Also suppose that the equation is already put in normal form with respect to


15 An even more complete picture, distinguishing the points attracted by $A_\alpha$ from those attracted by $\bar\omega$ can be obtained by using the results of Problems 12-19 of §5.6. The outcome would be the one described just before Proposition 11, p.411.

$\lambda, \bar\lambda$ (see Definition 6, p.392, §5.5; this can always be achieved via a change of coordinates, by Proposition 7, p.393, §5.5):

$$\begin{aligned} \dot x &= \alpha x - \mu(\alpha)\,y + P(x, y, \alpha), \\ \dot y &= \mu(\alpha)\,x + \alpha y + Q(x, y, \alpha), \end{aligned} \tag{5.7.4}$$

with $P, Q \in C^{(k)}(\mathbf{R}^2 \times (-a, a))$, $k$ being a large enough integer, and with $P, Q$ having a third-order zero in $x = y = 0$, $\forall\,\alpha \in (-a, a)$.

Finally suppose that the origin is vaguely attractive because the vague attractivity indicator $\bar\gamma_{\alpha_c}$ is negative. Recall that $\bar\gamma_\alpha$ is defined as the average value over $\theta$ of $\gamma_\alpha(\theta)$ with

$$\gamma_\alpha(\theta) = \lim_{\rho \to 0} \frac{x\,P(x, y, \alpha) + y\,Q(x, y, \alpha)}{(x^2 + y^2)^2} \tag{5.7.5}$$

if $(\rho, \theta)$ are the polar coordinates of $(x, y)$, see (5.5.25).
Then if $\alpha > 0$ is sufficiently small, there is a periodic solution to Eq. (5.7.4) which is an attractor attracting all the points in a small neighborhood of $O$, with the exception of $O$ itself, with exponential strength.

The period $T_\alpha$ of this motion is such that $\lim_{\alpha \to \alpha_c} T_\alpha = \frac{2\pi}{|\mu(0)|}$.
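
The content of Proposition 12 can be illustrated numerically on the simplest system of the form of Eq. (5.7.4). The sketch below is an assumption-laden example, not the proof: it takes $P = \gamma\,(x^2 + y^2)\,x$, $Q = \gamma\,(x^2 + y^2)\,y$ with hypothetical constants $\mu = 1$, $\gamma = -1$ (for which $(xP + yQ)/(x^2 + y^2)^2 = \gamma < 0$), integrates with a hand-written fourth-order Runge-Kutta step, and reports the radius reached after a long time. In polar coordinates this model reads $\dot\rho = \alpha\rho + \gamma\rho^3$, $\dot\theta = \mu$, so the expected limit radius is $\sqrt{\alpha/|\gamma|}$ and the period is $2\pi/|\mu|$.

```python
import math


def hopf_field(x, y, alpha, mu=1.0, gamma=-1.0):
    """Right-hand side of the model system; mu and gamma are hypothetical constants."""
    r2 = x * x + y * y
    return (alpha * x - mu * y + gamma * r2 * x,
            mu * x + alpha * y + gamma * r2 * y)


def rk4_step(x, y, dt, alpha):
    """One classical Runge-Kutta step of size dt."""
    k1 = hopf_field(x, y, alpha)
    k2 = hopf_field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], alpha)
    k3 = hopf_field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], alpha)
    k4 = hopf_field(x + dt * k3[0], y + dt * k3[1], alpha)
    x_new = x + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    y_new = y + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x_new, y_new


def final_radius(x0, y0, alpha, t_end=200.0, dt=0.02):
    """Distance from the origin at time t_end for the orbit starting at (x0, y0)."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt, alpha)
    return math.hypot(x, y)
```

For instance, `final_radius(0.01, 0.0, 0.04)` and `final_radius(0.5, 0.0, 0.04)` start inside and outside the expected orbit and can be compared with $\sqrt{0.04} = 0.2$; quadrupling $\alpha$ should double the radius.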


Observations.

(1) The requirement on $k$ to be large enough is imposed to guarantee the possibility of further reducing the complexity of Eq. (5.7.4) by changing coordinates so that the function $\gamma_\alpha(\theta)$ in Eq. (5.7.5) becomes $\theta$ independent (i.e. $\gamma_\alpha(\theta) = \bar\gamma_\alpha$) in the new polar coordinates and, at the same time, so that in the new coordinates the functions

$$\begin{aligned} r(x, y, \alpha) &= x\,P(x, y, \alpha) + y\,Q(x, y, \alpha) - \bar\gamma_\alpha\,(x^2 + y^2)^2, \\ s(x, y, \alpha) &= x\,Q(x, y, \alpha) - y\,P(x, y, \alpha) \end{aligned} \tag{5.7.6}$$

are infinitesimal of fifth order at $x = y = 0$, uniformly in $\alpha \in (-a, a)$, and also have gradients in $x, y$ infinitesimal of the fourth order [a property used below in Eqs. (5.7.17) and (5.7.18)].¹⁶ See Observation (8) for more details.
(2) In the application to Eq. (5.1.19), Eq. (5.7.4) is

$$\begin{aligned} \dot\omega_1 &= (\alpha - \alpha_c)\,\omega_1 - \bar\omega_3\,\omega_2 + P(\omega_1, \omega_2, \alpha), \\ \dot\omega_2 &= \bar\omega_3\,\omega_1 + (\alpha - \alpha_c)\,\omega_2 + Q(\omega_1, \omega_2, \alpha), \end{aligned} \tag{5.7.7}$$

where $P(\omega_1, \omega_2, \alpha)$, $Q(\omega_1, \omega_2, \alpha)$ are respectively


= w (Aw? + A$wWE + AY 2O3pa (w1, we) + NF Ya (we, w2)*) — w2Ya lwi, we), 


= wo(Agwy + A$ws + NY 23a (wi, we) + AF Pa(we, w2)*) + wi Pa(wr, w2), 


(5.7.8) 


16 $k > 5$ will suffice, see Observation (8).

$\varphi_\alpha$ being the function defining the attractive manifold, and it is of class $C^{(k)}$, $k$ chosen (once and for all) as large as desired. Hence,

O= lim -X + Mus + N Dapa lwn we) 
a = a eS OS ee 


5.7.9 
w1,w2—>0 (w01? + w2) ( ) 


and to evaluate $\bar\gamma_\alpha$, one does not need to know explicitly $\varphi_\alpha$. One can proceed as in the proof of Proposition 10, p.405, setting $\pi = \varphi_\alpha$; by the same calculation one finds
1 MID, 
Ais are eek (5.7.10) 
2(A;i + 3A5 3) 

Hence, Eq. (5.1.19) has a periodic attractive solution for $\alpha > \alpha_c$ and $(\alpha - \alpha_c)$ small.
(3) As already noted, the assumption that the equation $\dot x = f(x, \alpha)$ has normal form with respect to $(\lambda(\alpha), \bar\lambda(\alpha))$ is not really restrictive if $\mu(0) \ne 0$, by Proposition 7, §5.5, p. 393.
The assumption $\mathrm{Re}\,\lambda(\alpha) = \alpha$ is also not too restrictive: if $\frac{d}{d\alpha}\mathrm{Re}\,\lambda(\alpha_c) \ne 0$ we can rename $\pm\mathrm{Re}\,\lambda(\alpha)$ with the name $\alpha$ and fall within the assumptions of the theorem. However, pathologies can appear if $\mathrm{Re}\,\lambda(\alpha)$ has a vanishing derivative at $\alpha_c$.
(4) The theorem has been formulated in class $C^{(k)}$ rather than in class $C^\infty$ because it is usually applied in connection with the attractive manifold theorem, Proposition 11, p.411 [as, for instance, in Observation (2)], in which case one cannot take $k = +\infty$, in general.
(5) It is important to stress the rather general situation that the above the- 
orem can cover, if combined with the attractive manifold theorem of 85.6, 
and with the normal-form theorem (Proposition 7, p.393, §5.5) when the loss 
of stability takes place in two non real conjugate directions. One just has to 
perform the changes of variables (possible if Reala) Æ 0, u(0) Æ 0) casting 
the first two equations, among the d equations of the transformed system, 
into normal form with respect to the two eigenvalues \(q@), A(a) “responsible 
for the loss of stability”, as 


tı =Q% — ula)z2 T P(21,22,y, a), (5 7 11) 


t2 =u(a)zı + arg + Qla, £2,Y, 0), 


where y denotes the remaining (d — 2) unknowns of the differential equation. 

Then one considers the differential equation in $\mathbf{R}^2$ of the form of Eq.
(5.7.4) with $P(x_1,x_2,\alpha) = P(x_1,x_2,0,\alpha)$, $Q(x_1,x_2,\alpha) = Q(x_1,x_2,0,\alpha)$. If
this equation verifies the assumptions of Proposition 12, we can infer that the
original equation has an attractive periodic orbit for $\alpha$ slightly above $\alpha_c$.
The proof of this simple criterion is obtained by the obvious extension to $\mathbf{R}^d$
of the discussion in Observation (2) (write $y = \varphi_\alpha(x_1, x_2)$ and use the fact
that $\varphi_\alpha$ vanishes to second order).

(6) The above theorem has a natural analogue in one dimension. Consider the
equation in $\mathbf{R}$:

$$\dot x = \alpha x + p(x, \alpha), \tag{5.7.12}$$

where $p \in C^{(k)}(\mathbf{R}^2)$, $k$ large enough, and $p$ has a third-order zero in $x = 0$,
$\forall \alpha \in (-a, a)$, and

$$c(\alpha) = \lim_{x \to 0} \frac{p(x, \alpha)}{x^3} \tag{5.7.13}$$

with $c(\alpha) < 0$ and continuous near $\alpha = 0$. If $k$ is large, by the implicit
function theorem, Eq. (5.7.12) has two stationary solutions, for $\alpha > 0$ and
small ($x \simeq \pm\sqrt{-\alpha/c(\alpha)}$).

At such points, the stability “matrix” is, to leading order in $\alpha$, $-2\alpha < 0$ and,
therefore, the two points are attractors with exponential strength for the points
in their vicinity.
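
The one-dimensional statement can be checked on the simplest model case. The sketch below assumes the hypothetical choice $p(x,\alpha) = c\,x^3$ with a constant $c < 0$ (the text only requires a third-order zero with negative coefficient), computes the two stationary points and evaluates the derivative of the right-hand side there; the function names are illustrative only.

```python
import math

def stationary_points(alpha, c=-1.0):
    """Non-zero solutions of alpha*x + c*x**3 = 0 (model choice p = c*x**3, c < 0)."""
    if alpha <= 0:
        return []  # for alpha <= 0 only x = 0 is stationary
    x_star = math.sqrt(-alpha / c)
    return [-x_star, x_star]

def stability_exponent(x, alpha, c=-1.0):
    """Derivative in x of the right-hand side alpha*x + c*x**3."""
    return alpha + 3.0 * c * x * x

alpha = 0.04
for x_star in stationary_points(alpha):
    # The exponent should be close to -2*alpha at both stationary points.
    print(x_star, stability_exponent(x_star, alpha))
```

For this model the exponent equals $-2\alpha$ exactly; for a general $p$ the same value is only the leading term for small $\alpha$.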

This observation is sometimes useful in treating cases analogous to the
ones discussed in Observation (5), when the stationary solution loses stability
because only one real eigenvalue crosses the imaginary axis, as $\alpha$ grows through
a critical value $\alpha_c$, leaving the stationary solution vaguely attractive.

However, it should be stressed that this is a rather rare possibility since it
is generally impossible to put a one-dimensional equation into normal form,
see observation (3), p.397. The existence of normal form can be expected only
in systems with “some symmetry” (like the $x \to -x$ odd symmetry of $p(x, \alpha)$).

Note also that if Eq. (5.7.12) has the property of Eq. (5.7.13), then a small
perturbation of it, like

$$\dot x = \alpha x + p(x, \alpha) + \varepsilon x^2, \tag{5.7.14}$$

can change the vague-attractivity character of $x = 0$ for $\alpha$ near 0, no matter
how small $\varepsilon$ is (exercise). This phenomenon is not possible in equations in
which the loss of stability takes place in two complex non real directions
(essentially just because of the existence of normal forms).

(7) The mechanism of generation of a periodic orbit out of a fixed point when
$\alpha$ grows through $\alpha_c$, described in Proposition 12, is called a “Hopf bifurcation”.
The solution $x_0$ loses stability in two complex directions at $\alpha = \alpha_c$ and, if
it stays vaguely attractive in the sense of Eq. (5.7.5), it is surrounded by a
periodic attractive motion taking place on a curve whose diameter, as we shall
see, grows as $\sqrt{\alpha - \alpha_c}$ for $\alpha - \alpha_c > 0$ and small.
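
The square-root growth of the orbit can be observed numerically on a model. The sketch below is only an illustration under stated assumptions: it takes the planar system $\dot x = \alpha x - \mu y + \bar\gamma x(x^2+y^2)$, $\dot y = \mu x + \alpha y + \bar\gamma y(x^2+y^2)$ with the hypothetical values $\mu = 1$, $\bar\gamma = -1$, $\alpha_c = 0$ and no higher-order terms, integrates it with a hand-written fourth-order Runge-Kutta step, and compares the asymptotic radius with $\sqrt{\alpha}$.

```python
import math

def field(x, y, alpha, mu=1.0, gamma_bar=-1.0):
    """Model vector field with a cubic term only (assumed, no higher-order P, Q)."""
    r2 = x * x + y * y
    return (alpha * x - mu * y + gamma_bar * x * r2,
            mu * x + alpha * y + gamma_bar * y * r2)

def rk4_step(x, y, h, alpha):
    """One classical Runge-Kutta step of size h."""
    k1 = field(x, y, alpha)
    k2 = field(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], alpha)
    k3 = field(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], alpha)
    k4 = field(x + h * k3[0], y + h * k3[1], alpha)
    return (x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def asymptotic_radius(alpha, x0=0.5, y0=0.0, t_end=400.0, h=0.01):
    """Distance from the origin after integrating up to time t_end."""
    x, y = x0, y0
    for _ in range(int(t_end / h)):
        x, y = rk4_step(x, y, h, alpha)
    return math.hypot(x, y)

for alpha in (0.01, 0.04, 0.09):
    # Second and third columns should agree closely.
    print(alpha, asymptotic_radius(alpha), math.sqrt(alpha))
```

The integration time is chosen long because the attraction rate toward the orbit is itself proportional to $\alpha$ in this model.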

(8) As shown in the proof of Proposition 8, p.396, it is always possible to
change smoothly coordinates so as to put Eq. (5.7.4) into a form such that
$\bar\gamma_\alpha$ is $\alpha$ independent: $\bar\gamma_\alpha \equiv \bar\gamma$, $\forall \alpha \in (-a, a)$, i.e.,

$$\begin{aligned}
\dot x &= \alpha x - \mu(\alpha) y + \bar\gamma\, x (x^2 + y^2) + P(x, y, \alpha),\\
\dot y &= \mu(\alpha) x + \alpha y + \bar\gamma\, y (x^2 + y^2) + Q(x, y, \alpha),
\end{aligned} \tag{5.7.15}$$

with $\bar\gamma < 0$ and with $P, Q$ infinitesimal of fourth order at $x = y = 0$, uniformly
in $\alpha \in (-a, a)$ (possibly reducing the value of $a$); see the change of variables
of Eq. (5.5.39) changing Eq. (5.5.38) (i.e., essentially, Eq. (5.7.4) written in
complex form) into Eq. (5.5.43) (i.e., (5.5.37)). By Eqs. (5.5.42) and (5.5.38),
one realizes that the needed change of coordinates involves the third-order
Taylor coefficients of $P$ and $Q$ at $x = y = 0$, $\alpha = \alpha_c$, with respect to the
variables $x, y$ and it turns out to be of class $C^\infty$ in the variables $x, y$ near
$x = y = 0$ and $\alpha$ small (but, in general, only of class $C^{(k-3)}$ in $\alpha$).

If $k > 5$, the functions $P, Q$ in Eq. (5.7.15) have fifth-order derivatives with
respect to $x, y$ continuous in $x, y, \alpha$ near $(0,0,0)$ and also have a fourth-order
zero in $x, y$ at $x = y = 0$, $\forall \alpha \in (-a, a)$, if $a$ is small.

Furthermore, the functions $r, s$ of Eq. (5.7.6) are now equal to

$$\begin{aligned}
r(x, y, \alpha) &= x\, P(x, y, \alpha) + y\, Q(x, y, \alpha),\\
s(x, y, \alpha) &= x\, Q(x, y, \alpha) - y\, P(x, y, \alpha),
\end{aligned} \tag{5.7.16}$$

by Eq. (5.7.15), and their derivatives in $x, y$ are continuous in $x, y, \alpha$ near
$(0, 0, 0)$ and have a fourth-order zero at $x = y = 0$, $\forall \alpha \in (-a, a)$.

Hence, to fix the ideas, we shall suppose that “$k$ large enough” means $k > 5$.
However, this is not optimal, and one can improve the value of the degree of
regularity in $x, y, \alpha$ necessary for $P, Q$ so that a proposition like Proposition
12 will hold in general. To obtain fine results, one should distinguish the
regularity imposed on the $\alpha$ variable and that on the $x, y$ variables.


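PROOF. By observation (8), if $k > 5$, it suffices to treat Eq. (5.7.15) with
$P, Q, \partial r, \partial s$ [see Eqs. (5.7.15) and (5.7.16)] being fourth-order infinitesimals
in $x, y$ for $x = y = 0$, uniformly in $\alpha \in (-a, a)$ (here $\partial$ denotes the gradient
with respect to the $x, y$ variables).

Let $\gamma = -\bar\gamma$, $\bar\mu = \mu(0)$. By the infinitesimality properties of $P, Q$, it is
possible to find $\bar\rho > 0$, $0 < \bar a < a$, such that, for all $(x, y) \in C(\bar\rho)/\{0\}$, with
$C(\bar\rho) \overset{\mathrm{def}}{=} \{x, y \mid (x, y) \in \mathbf{R}^2, \sqrt{x^2 + y^2} < \bar\rho\}$, and for all $\alpha \in (-\bar a, \bar a)$

$$\begin{aligned}
&\alpha - \tfrac{1}{2}\gamma\bar\rho^2 < 0, \qquad \tfrac{2}{3}\bar\mu < \mu(\alpha) < \tfrac{3}{2}\bar\mu,\\
&\frac{|r(x, y, \alpha)|}{(x^2 + y^2)^2} < \frac{\gamma}{2}, \qquad \frac{|s(x, y, \alpha)|}{x^2 + y^2} < \frac{\bar\mu}{2},
\end{aligned} \tag{5.7.17}$$

having supposed, for definiteness, that $\bar\mu > 0$. Call $C(\rho', \rho'') \overset{\mathrm{def}}{=}$ {annulus with
radii $\rho' < \rho''$} $= \{x, y \mid (x, y) \in \mathbf{R}^2, \rho' < \sqrt{x^2 + y^2} < \rho''\}$.

We now check that Eq. (5.7.17) implies that the disk $C(\bar\rho)$ is $S_t^{(\alpha)}$ invariant
and that there is also an invariant annulus $C(\rho'_-, \rho'_+) \subset C(\bar\rho)$ with $0 < \rho'_-, \rho'_+ < \bar\rho$
which is an attractor for the points in $C(\bar\rho)/\{0\}$, for all $\alpha \in (-\bar a, \bar a)$.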

In fact, multiply the first of Eqs. (5.7.15) by $x$, the second by $y$, and add
the results:

$$\frac{d}{dt}\,\frac{x^2 + y^2}{2} = \Big(\alpha + \bar\gamma\,(x^2 + y^2) + \frac{r(x, y, \alpha)}{(x^2 + y^2)^2}\,(x^2 + y^2)\Big)(x^2 + y^2)
\;\begin{cases} < \big(\alpha - \tfrac{1}{2}\gamma\,(x^2 + y^2)\big)(x^2 + y^2) \\ > \big(\alpha - \tfrac{3}{2}\gamma\,(x^2 + y^2)\big)(x^2 + y^2) \end{cases} \tag{5.7.18}$$

which shows [see the first of Eqs. (5.7.17)] that the intermediate term in
Eq. (5.7.18) is negative on $\partial C(\bar\rho)$. This means that $C(\bar\rho)$ is $S_t^{(\alpha)}$ invariant,
$\forall t \ge 0$, $\forall \alpha \in (-\bar a, \bar a)$. Let

$$\rho_- = \sqrt{\frac{2\alpha}{3\gamma}}, \qquad \rho_+ = \sqrt{\frac{2\alpha}{\gamma}} \tag{5.7.19}$$

and note that the inequalities in Eq. (5.7.18) show that the intermediate term
in Eq. (5.7.18) is positive on $\partial C(\rho_-)$ and negative on $\partial C(\rho_+)$; hence, the
annulus $C(\rho_-, \rho_+)$ is $S_t^{(\alpha)}$ invariant, if $\alpha$ is small so that $\rho_+ < \bar\rho$.

Equations (5.7.17) and (5.7.18) also show that if $\rho'_- = \tfrac{1}{2}\rho_-$, $\rho'_+ = 2\rho_+ < \bar\rho$
the annulus $C(\rho'_-, \rho'_+)$ is also invariant and enjoys the property that any initial
datum chosen in $C(\bar\rho)/\{0\}$ evolves, entering into $C(\rho'_-, \rho'_+)$ in a finite time (see
Fig. 5.7), $\forall \alpha \in (-\bar a, \bar a)$.
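
The sign argument behind the invariant annulus can be tested numerically. The sketch below assumes that the radial balance has the form $\frac{d}{dt}\frac{\rho^2}{2} = (\alpha - \gamma\rho^2 + c\,\rho^2)\rho^2$, where the constant $c$ stands in for the ratio $r/(x^2+y^2)^2$ and is only required to satisfy $|c| < \gamma/2$, and that the radii of Eq. (5.7.19) are read as $\sqrt{2\alpha/3\gamma}$ and $\sqrt{2\alpha/\gamma}$; the numerical values of $\alpha$ and $\gamma$ are hypothetical.

```python
import math

def radial_rate(rho, alpha, gamma, c):
    """(alpha - gamma*rho^2 + c*rho^2) * rho^2, c modelling r/(x^2+y^2)^2."""
    return (alpha - gamma * rho ** 2 + c * rho ** 2) * rho ** 2

def annulus_radii(alpha, gamma):
    """Inner and outer radii on which the rate has a definite sign."""
    return math.sqrt(2 * alpha / (3 * gamma)), math.sqrt(2 * alpha / gamma)

alpha, gamma = 0.02, 1.5
rho_minus, rho_plus = annulus_radii(alpha, gamma)
for c in (-0.49 * gamma, 0.0, 0.49 * gamma):
    # Expect a positive rate on the inner circle and a negative one on the outer.
    print(c, radial_rate(rho_minus, alpha, gamma, c),
          radial_rate(rho_plus, alpha, gamma, c))
```

Whatever admissible value the perturbation takes, the flow points outward on the inner circle and inward on the outer one, which is the invariance statement.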


Figure 5.7: Initial data in $C(\bar\rho)$ enter in a finite time the shaded annulus $C(\rho'_-, \rho'_+)$.


In fact, if $\bar\rho > \rho_0 = \sqrt{x^2 + y^2} > \rho'_+$, the first inequality in the right-hand side
of Eq. (5.7.18) shows that the intermediate term of Eq. (5.7.18) is bounded above
by a strictly negative constant as long as the motion stays outside $C(\rho'_-, \rho'_+)$,
so that the “entrance time” in $C(\rho'_-, \rho'_+)$ is finite.

If $0 < \rho_0 = \sqrt{x^2 + y^2} < \rho'_-$, the intermediate term of Eq. (5.7.18) is not less
than $m = \min_{\rho'_- \ge \rho \ge \rho_0} (\alpha\rho^2 - \tfrac{3}{2}\gamma\rho^4) > 0$ by the second inequality in the
right-hand side of Eq. (5.7.18). Hence, the entrance time can now be estimated
by $\tau = (\rho'^2_- - \rho_0^2)/2m$.

This means that every datum close to the origin moves away from the
origin until it enters the annulus $C(\rho'_-, \rho'_+)$ in a finite time, while every datum
close to $\partial C(\bar\rho)$ moves towards the origin until it enters the annulus $C(\rho'_-, \rho'_+)$
in a finite time. These motions are spiraling motions, as we now show.

To see that the motions starting in $C(\bar\rho)/\{0\}$ are “spiraling motions”, it
suffices to study them in polar coordinates.

If $S_t^{(\alpha)}(x, y) = (x(t), y(t))$ and if $(\rho(t), \theta(t))$ are the polar coordinates of
$(x(t), y(t)) \in C(\bar\rho)/\{0\}$,

$$\begin{aligned}
\frac{d\theta}{dt} &= \frac{d}{dt}\arctan\frac{y(t)}{x(t)} = \frac{\dot y x - \dot x y}{x^2 + y^2},\\
\frac{d\rho}{dt} &= \frac{d}{dt}\sqrt{x(t)^2 + y(t)^2} = \frac{\dot x x + \dot y y}{\sqrt{x^2 + y^2}}.
\end{aligned} \tag{5.7.20}$$

Note that if $\rho(0) > 0$, $\rho(0) < \bar\rho$, then $\rho(t) > 0$ and $\rho(t) < \bar\rho$ for all $t \ge 0$,
because of the above arguments. Hence, Eq. (5.7.15) and the second and
fourth inequalities in (5.7.17) imply

$$\dot\theta = \mu(\alpha) + \frac{s(x, y, \alpha)}{x^2 + y^2} \;\Rightarrow\; \tfrac{1}{6}\bar\mu < \dot\theta < 2\bar\mu, \tag{5.7.21}$$

i.e., $\theta$ is monotonic in $t$ and diverges as $t \to +\infty$. This just means that the
motion spirals if $0 < \rho(0) < \bar\rho$.

We now check that the spirals associated with the initial data external to
$C(\rho'_+)$, but in $C(\bar\rho)$, become asymptotically confused, as $t \to +\infty$, with those
associated with data internal to $C(\rho'_-)$, but different from the origin.

If this happens, the two families of spirals are separated by a periodic orbit
which will be an attractor with basin containing $C(\bar\rho)/\{0\}$.

To discuss the asymptotic identity of the spirals it is convenient to describe
them as geometric objects, thinking of them as parameterized in terms of $\theta$
instead of $t$, which is possible by Eq. (5.7.21).

Let $\theta \to \rho_1(\theta)$ and $\theta \to \rho_2(\theta)$ be the equations in polar coordinates of two
spirals on which two motions of Eq. (5.7.15) run, starting with initial data
$\rho_1(0) > \rho'_-$, $\theta_1(0) = 0$ and $\rho_2(0) < \rho'_+$, $\theta_2(0) = 0$ and $\rho_1(0) < \rho_2(0)$.

By the uniqueness theorem for the solutions of the differential equations
and by the autonomy of Eq. (5.7.15), we see that $\rho_2(\theta) - \rho_1(\theta) > 0$, $\forall \theta \ge 0$.
We show the existence of $R > 0$, $\varepsilon(\alpha) > 0$ such that for $\alpha$ small enough,

$$\rho_2(\theta) - \rho_1(\theta) < R\, e^{-\varepsilon(\alpha)\theta}. \tag{5.7.22}$$

Then the autonomy of Eq. (5.7.15) and Eqs. (5.7.22) and (5.7.21) plus the
attractivity properties of $C(\rho'_-, \rho'_+)$ will imply that every datum in $C(\bar\rho)/\{0\}$
evolves exponentially fast in $\theta$ (with rate constant $\ge \varepsilon(\alpha)$) towards a
periodic trajectory of Eq. (5.7.15) which separates geometrically the “outer”
spirals (i.e., those originating outside $C(\rho'_+)$) from the “inner” spirals (i.e.,
those originating inside $C(\rho'_-)$).
To prove Eq. (5.7.22), note that Eqs. (5.7.20), (5.7.18), and (5.7.21) imply

$$\frac{d\rho}{d\theta} = \rho\,\frac{\alpha\,(x^2 + y^2) + \bar\gamma\,(x^2 + y^2)^2 + r(x, y, \alpha)}{\mu(\alpha)\,(x^2 + y^2) + s(x, y, \alpha)} \tag{5.7.23}$$

where $r, s$ are infinitesimals of fifth order in $x, y$ at $x = y = 0$, uniformly
in $\alpha \in (-\bar a, \bar a)$, while their gradients with respect to $x$ and $y$ have the same
property to fourth order.

Equation (5.7.23) will be rewritten as

$$\frac{d \log\rho}{d\theta} = \frac{\alpha + \bar\gamma\rho^2 + r(x, y, \alpha)\,\rho^{-2}}{\mu(\alpha) + s(x, y, \alpha)\,\rho^{-2}}. \tag{5.7.24}$$

We now wish to show that the right-hand side of Eq. (5.7.24) is monotonic
in $\rho$ for $\rho \in [\rho'_-, \rho'_+]$ at fixed $\theta$ and that its $\rho$ derivative stays away from zero.

To estimate the derivative just compute it. Basically, the possibility of the
bound is due to the fact that to the lowest order in $\rho$, the right-hand side of
Eq. (5.7.24) is $(\alpha + \bar\gamma\rho^2)/\mu(\alpha)$ whose $\rho$-derivative is $2\bar\gamma\rho/\mu(\alpha)$.

So we expect that if $\bar\rho$ is small enough [with $\bar a$ chosen correspondingly
small so that the first of Eqs. (5.7.17) still holds], the $\rho$ derivative of the
right-hand side of Eq. (5.7.24) can be estimated, $\forall \rho \in [\rho'_-, \rho'_+]$ (using the orders of
infinitesimality of $r, s, \partial r, \partial s$ to neglect the terms in $r, s$) to be not larger than:


D Oa 


A direct calculation of the $\rho$ derivative of the right-hand side of Eq. (5.7.24)
actually proves the above statement, by Eq. (5.7.25).

Then recalling that $\rho_2(\theta) > \rho_1(\theta)$, $\forall \theta \ge 0$, and writing Eq. (5.7.24) for $\rho_2$
and $\rho_1$, and subtracting them, we find, applying the bound on the derivative
(5.7.25) (recalling that $\rho'_- < \rho_1(\theta)$):


d 0 0 
Z og 2D < yya) - 01(6)) = -xvin 0 (2E - 1) 
02(8) a_/02(8) o 
< -xv ag —-1)=-— -1 
< -xvala a 1) =a Gey 2) 
which interpreted as a differential inequality for $\rho_2/\rho_1$ yields

$$\Big(1 - \frac{\rho_1(\theta)}{\rho_2(\theta)}\Big) \le \Big(1 - \frac{\rho_1(0)}{\rho_2(0)}\Big)\, e^{-\varepsilon(\alpha)\theta} \tag{5.7.27}$$

by integration, and this completes the proof.


5.7.1 Exercises and Problems 


1. The estimate for the coefficient $\varepsilon(\alpha)$ in Eq. (5.7.22) is the one appearing in the exponent
of Eq. (5.7.27). Is it possible to improve it so that the new estimate $\tilde\varepsilon(\alpha)$ has the property
that $\tilde\varepsilon(\alpha) \to \tilde\varepsilon > 0$ as $\alpha \to 0$? If not, find a physical interpretation or a motivation of this fact.


2. Consider the differential equation in $\mathbf{R}^2$ written in complex form as $\dot z = \xi(\alpha) z + P(z, \bar z)$,
where $z = x + iy$, $(x, y) \in \mathbf{R}^2$, $\xi(\alpha) = \sigma(\alpha) + i\mu(\alpha)$, and let $\sigma(0) = 0$, $\mu(0) \neq 0$,
$\sigma, \mu \in C^\infty(\mathbf{R})$; suppose $P$ to be a $C^\infty$ function of $x, y$ with a second-order zero at the origin. In the
proof of Proposition 8, p.396, it was shown [see the change of variables in Eq. (5.5.39)] that in
some new coordinates the equation can be given the form $\dot z = \xi(\alpha) z + c_2(\alpha) z|z|^2 + O(|z|^4)$,
where $O(|z|^4)$ symbolically denotes a function of $x, y, \alpha$ of class $C^\infty$ and with a fourth-order
zero at $z = 0$ for all $\alpha$ near zero. Show that the equation can be given the form:

$$\dot z = \xi(\alpha) z + c_2(\alpha) z|z|^2 + O(|z|^5)$$

with the same meaning of the symbols, after a new change of coordinates. (Hint: Again
change coordinates as $\zeta = z + \Gamma_4(z, \bar z)$, where $\Gamma_4$ is a homogeneous polynomial in $z, \bar z$ of
fourth degree, such that the fourth-order terms in the equation cancel, see Eq. (5.5.39)-(5.5.43).)


3. In the context of Problem 2, develop the same ideas to show that, $\forall k > 0$, the equation
can be put, in a suitable coordinate system, in the form

$$\dot z = \xi(\alpha) z + c_2(\alpha) z|z|^2 + c_4 z|z|^4 + \ldots + c_{2k} z|z|^{2k} + O(|z|^{2k+2})$$

(Hint: Use induction.)


4. Show that in Problems 2 and 3, the assumption $\sigma(0) = 0$ is not necessary. Actually, if
$\sigma(0) \neq 0$, show that, by the same type of arguments, the equation can be given the form

$$\dot z = \xi(\alpha) z + O(|z|^k)$$

for all $k > 0$. (Hint: Note that the reason why one could not eliminate $c_2 z|z|^2$ in Problem
2 was that $\xi(0) + \bar\xi(0) = 0$.)


5. In Problems 2-4, the parameter $\alpha$ does not play a very essential role. Formulate
statements of the same type for $\alpha$-independent equations. (Hint: Just set $\alpha = 0$ in Problems 2-4
and determine what can be said.)

For information about the problems related to the iterated composition of coordinate
transformations transforming the original equations into a fully linear equation $\dot z = \xi z$ when
$\mathrm{Re}\,\xi \neq 0$ and $\mathrm{Im}\,\xi \neq 0$ (by letting $k \to +\infty$ in Problem 4), see [34].


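6. Discuss the bifurcation pattern, as $\alpha$ grows, for the stationary solutions of the equation

$$\begin{aligned}
\dot\gamma_1 &= -2\gamma_1 + 4\gamma_2\gamma_3,\\
\dot\gamma_2 &= -9\gamma_2 + 3\gamma_1\gamma_3,\\
\dot\gamma_3 &= -5\gamma_3 - 7\gamma_1\gamma_2 + \alpha.
\end{aligned}$$

7. Same as problem 6 for

$$\begin{aligned}
\dot\gamma_1 &= -2\gamma_1 + 4\gamma_2\gamma_3 + 4\gamma_4\gamma_5,\\
\dot\gamma_2 &= -9\gamma_2 + 3\gamma_1\gamma_3,\\
\dot\gamma_3 &= -5\gamma_3 - 7\gamma_1\gamma_2 + \alpha,\\
\dot\gamma_4 &= -5\gamma_4 - \gamma_1\gamma_5,\\
\dot\gamma_5 &= -\gamma_5 - 3\gamma_1\gamma_4,
\end{aligned}$$

assuming (without checking it) that when a stationary solution loses stability in one real
direction or in two complex ones, it remains vaguely attractive with a negative
vague-attractivity indicator [as defined in Eqs. (5.5.24) and (5.5.25)]. See §5.8 for a more detailed
analysis.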


8. Find some improvements on the regularity requirements in the variables $x, y$, and $\alpha$ in
Proposition 12, possibly requiring a different order of regularity in $x, y$, or $\alpha$.

9. In the context of Proposition 12, suppose that $\bar\gamma_{\alpha_c}$, as defined there, is positive. Show
that in this case, if $\alpha_c = 0$, there is a repulsive periodic orbit for Eq. (5.7.4) for $\alpha < 0$ small.
(Hint: Just change $t$ into $-t$ and apply Proposition 12, noting that the change of $t$ into $-t$
changes the notion of attractivity into that of “repulsivity”.)


5.8 On the Stability Theory for Periodic Orbits and 
More Complex Attractors (Introduction) 


Nondum matura est. 


In this section we devote some attention to what happens, as $\alpha$ increases,
to the periodic solution of Eq. (5.1.19) whose existence has been established
in §5.6 and §5.7. More generally, one can ask how to establish stability
criteria for periodic solutions to differential equations, with uniformly bounded
trajectories, of the type:

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \alpha) \tag{5.8.1}$$

with $\mathbf{f} \in C^\infty(\mathbf{R}^d \times \mathbf{R})$ or $\mathbf{f} \in C^{(k)}(\mathbf{R}^d \times \mathbf{R})$ with $k$ large enough.

Before examining the evolution of the stability of a periodic orbit of Eq.
(5.8.1) when $\alpha$ varies, it is necessary to investigate the notions of stability of
a periodic motion of the equation in $\mathbf{R}^d$:

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}) \tag{5.8.2}$$

with $\mathbf{f} \in C^\infty(\mathbf{R}^d)$ or $C^{(k)}(\mathbf{R}^d)$ with $k$ large enough and such that Eq. (5.8.2)
has bounded trajectories.

Let $t \to \mathbf{x}(t)$, $t \ge 0$, be a periodic solution of Eq. (5.8.2) with minimal
period $T > 0$. The stability and the attractivity of this solution is conveniently
described in terms of the “Poincaré transformation”.


7 Definition. Let $t \to \mathbf{x}(t)$ be a periodic solution of Eq. (5.8.2) with minimal
period $T > 0$.

Let $\xi_0$ be a point on this trajectory, say $\xi_0 = \mathbf{x}(0) \in \mathbf{R}^d$, and let $\sigma$ be a
$(d-1)$-dimensional flat surface element cutting the orbit at the point $\xi_0$ so that the
orbit is not tangent to $\sigma$ in $\xi_0$ (“transversal surface element”).

It is then possible to define a $C^\infty$ transformation [or a $C^{(k)}$ transformation,
if the right-hand side of Eq. (5.8.2) is only of class $C^{(k)}$], on a neighborhood
of $\xi_0$ relative to $\sigma$ and with values on $\sigma$ itself, by considering a neighborhood
$U$ of $\xi_0$ on $\sigma$ so small that the motion, according to Eq. (5.8.2), of the initial
datum $\xi \in U$ comes back to intersect $\sigma$ for the first time after a time $\tau_\xi \simeq T$
at a point $\Phi_\sigma(\xi) \in \sigma$.

The map of $\sigma \cap U$ into $\sigma$ associating with $\xi \in \sigma \cap U$ the point $\Phi_\sigma(\xi) \in \sigma$
is called the “Poincaré transformation” relative to the given periodic orbit, to
the given surface element, and to the given vicinity $U$.

It is then possible to formulate the following sufficient stability and
attractivity criterion (and instability criterion as well) for a periodic orbit. It is the
best illustration of the meaning and of the interest of the Poincaré maps.


13 Proposition. Let $t \to \mathbf{x}(t)$, $t \ge 0$, be a periodic motion for Eq. (5.8.2)
with minimal period $T > 0$.

Let $\sigma$ be a transversal surface element to the trajectory in $\xi_0 = \mathbf{x}(0)$ and
introduce on $\sigma$ Cartesian coordinates $\eta = (\eta_1, \ldots, \eta_{d-1})$ with origin in $\xi_0$.
Denote by $\eta' = \Phi_\sigma(\eta)$ the Poincaré map defined in a suitable neighborhood of
$\xi_0$ on $\sigma$. By definition it is $\Phi_\sigma(0) = 0$.

Define the “stability” or “Lyapunov” matrix of the periodic orbit, relative to
$\sigma$ and to the given system of coordinates on it, as

$$(L_\sigma)_{ij} = \frac{\partial \Phi_\sigma^{(i)}}{\partial \eta_j}(0), \qquad i, j = 1, \ldots, d-1. \tag{5.8.3}$$

Then the periodic orbit is stable and is an attractor, with exponential strength,
for the points close enough to it if all the eigenvalues of the matrix $L_\sigma$ have
modulus less than 1.

If at least one among the eigenvalues of $L_\sigma$ has modulus larger than 1, the
orbit is unstable.
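
The Poincaré map and the stability matrix can be computed numerically in the simplest case $d = 2$, where the section is a half-line and $L_\sigma$ reduces to a single number. The sketch below is an illustration under stated assumptions, not the method of the text: it takes the hypothetical model $\dot x = \alpha x - \mu y - x(x^2+y^2)$, $\dot y = \mu x + \alpha y - y(x^2+y^2)$ with $\alpha = 0.25$, $\mu = 1$, uses the section $\{y = 0, x > 0\}$, measures the coordinate on the section from the origin of the plane rather than from the orbit, and approximates the derivative by a central difference. For this model the radial equation $\dot\rho = \alpha\rho - \rho^3$ gives the multiplier $e^{-4\pi\alpha/\mu}$, which serves as a reference value.

```python
import math

ALPHA, MU = 0.25, 1.0  # hypothetical parameter values, not from the text

def field(x, y):
    """Model planar vector field with a periodic orbit of radius sqrt(ALPHA)."""
    r2 = x * x + y * y
    return (ALPHA * x - MU * y - x * r2, MU * x + ALPHA * y - y * r2)

def rk4_step(x, y, h):
    """One classical Runge-Kutta step of size h."""
    k1 = field(x, y)
    k2 = field(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
    k3 = field(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
    k4 = field(x + h * k3[0], y + h * k3[1])
    return (x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def poincare_map(eta, h=1e-3, max_steps=100000):
    """First return to {y = 0, x > 0} of the motion starting at (eta, 0)."""
    x, y = eta, 0.0
    for _ in range(max_steps):
        xn, yn = rk4_step(x, y, h)
        if y < 0.0 <= yn and xn > 0.0:
            w = -y / (yn - y)          # linear interpolation to y = 0
            return x + w * (xn - x)
        x, y = xn, yn
    raise RuntimeError("no return to the section")

def multiplier(eta_star, delta=1e-3):
    """Central-difference estimate of the derivative of the return map."""
    return (poincare_map(eta_star + delta)
            - poincare_map(eta_star - delta)) / (2 * delta)

ETA_STAR = math.sqrt(ALPHA)           # fixed point of the return map
L_SIGMA = multiplier(ETA_STAR)
print(poincare_map(ETA_STAR), L_SIGMA, math.exp(-4 * math.pi * ALPHA / MU))
```

The computed multiplier has modulus less than 1, in agreement with the attractivity of the orbit in this model.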


Observations. 

(1) This proposition is analogous to Proposition 6, p.382, §5.4, formulated for
maps rather than for differential equations (which can, however, be thought
of as “infinitesimal maps”). Its proof is left to the reader as an interesting
problem [see also Observation (2) below]. To study it, one should first
understand the case when $\Phi_\sigma$ is a linear map near 0. Proposition 13 bears the
name “stability criterion of Lyapunov” for maps.

(2) Proposition 13 is a special case of a slightly different proposition which
could be formulated on the stability of stationary points with respect to the
action of repeated applications of a map of $\mathbf{R}^d$ into itself.

The fact that $\Phi_\sigma$ is a Poincaré map plays little role in the proof which is, in
fact, split into two parts:

(i) show that the origin is an exponentially attracting (or, alternatively,
unstable) point for the iterates of $\Phi_\sigma$;

(ii) remark that since $\Phi_\sigma$ is a Poincaré map relative to a periodic orbit for Eq.
(5.8.2), (i) implies that the periodic orbit exponentially attracts the points
close enough to it (or is, alternatively, unstable).

And (ii) follows trivially from (i), which could be phrased without reference
to the Poincaré map but simply for an arbitrary map of a surface into itself
(with a fixed point).


Now consider Eq. (5.8.1) and assume that, $\forall \alpha \in (\alpha', \alpha'') \overset{\mathrm{def}}{=} J$, this
equation admits among its solutions a periodic motion $t \to \mathbf{x}_\alpha(t)$, $t \ge 0$, with
minimal period $T_\alpha > 0$ and such that the function $(\alpha, t) \to \mathbf{x}_\alpha(t)$ is a $C^{(k)}$
function on $J \times [0, +\infty)$, if $C^{(k)}$ is the regularity class in the right-hand side
of Eq. (5.8.1).

It will then be possible to consider, $\forall \alpha \in (\alpha', \alpha'')$, the stability matrix
$L_\sigma(\alpha)$, see Eq. (5.8.3), relative to a surface element $\sigma$ which, if $J = (\alpha', \alpha'')$
is a small enough interval, can be supposed to be $\alpha$ independent.

We can choose the Cartesian coordinate system on $\sigma$ for each $\alpha$, with
the origin at the point $\xi_\alpha$ at the intersection of $\sigma$ and the trajectory, and
smoothly varying with $\alpha$ so that the Poincaré maps $\Phi_{\sigma,\alpha}(\eta)$ are defined for
$\eta \in U$, where $U$ is a small enough neighborhood of the origin, and $\Phi_{\sigma,\alpha}(\eta)$ is
of class $C^{(k)}$ on $U \times (\alpha', \alpha'')$ in the variables $(\eta, \alpha)$ and

$$\Phi_{\sigma,\alpha}(0) = 0, \qquad \forall \alpha \in J. \tag{5.8.4}$$

We can and shall suppose that $\Phi_{\sigma,\alpha}$ is extended arbitrarily to a map of
$\mathbf{R}^{d-1}$ into itself, having the same regularity class $C^{(k)}$ (to define this extension
it might be first necessary to reduce slightly the size of $U$).

In analogy with the definitions of stability, attractivity, etc. relative to
the solution flows associated with differential equations, we can introduce
analogous notions for a single transformation $\Phi$ of $\mathbf{R}^d$, or of an open subset of
$\mathbf{R}^d$, into itself. What was formerly the family $(S_t)_{t \ge 0}$ of maps associated with
the solution of the differential equation now becomes the family $(\Phi^n)_{n \in \mathbf{Z}_+}$ of
the iterations of $\Phi$, i.e., one can think of $\Phi$ as an “evolution” on $\mathbf{R}^d$ observed
at integer times.

We do not repeat the obvious process of setting up the notions of stability,
attractivity, vague attractivity, etc. for the iterations of a map $\Phi$, and we just
mention that once such definitions are posed in an obvious manner (taking into
account the analogous definitions associated with the differential equations),
the following proposition on the existence of an attractive manifold and on
the Hopf bifurcations holds.


14 Proposition. (i) Consider Eq. (5.8.1) with $\mathbf{f} \in C^{(k+1)}$, $k \ge 1$, and suppose
that the equation admits a family of periodic orbits verifying the properties
illustrated in the above text, following the observations to Proposition 13.

Suppose that for $\alpha \in J = (\alpha', \alpha'')$, the stability matrix $L_\sigma(\alpha)$ has the
eigenvalues $\lambda_{s+1}(\alpha), \ldots, \lambda_{d-1}(\alpha)$ with modulus less or equal to $\nu < 1$ and that,
for some $\nu' \in (\nu, 1)$, the other eigenvalues $\lambda_1(\alpha), \ldots, \lambda_s(\alpha)$ have modulus
larger or equal to $\nu'$. Also suppose that the plane generated by the eigenvectors
of $L_\sigma(\alpha)$ associated with the eigenvalues $\lambda_1(\alpha), \ldots, \lambda_s(\alpha)$ coincides with
the plane $\eta_{s+1} = \ldots = \eta_{d-1} = 0$.

If the origin is vaguely attractive for the maps $\Phi_{\sigma,\alpha}$ near $\alpha_c \in J$, and if
$|\lambda_j(\alpha_c)| = 1$, $j = 1, \ldots, s$, there exist $\varepsilon > 0$, $\delta, \delta_0 > 0$, $\delta_0 < \delta$ and $d-1-s$
functions $\varphi^{(1)}, \ldots, \varphi^{(d-1-s)}$ defined in the neighborhood$^{17}$ $\Gamma_s(\delta) \times (\alpha_c - \varepsilon, \alpha_c + \varepsilon)$
and there of class $C^{(k)}$ such that the equations

$$\eta_{s+j} = \varphi^{(j)}(\eta_1, \ldots, \eta_s, \alpha), \qquad j = 1, \ldots, d-1-s \tag{5.8.5}$$

define in $\Gamma_{d-1}(\delta_0)$ a family of surfaces $\sigma_\alpha$ parameterized by $\alpha \in (\alpha_c - \varepsilon, \alpha_c + \varepsilon)$
which are locally invariant, locally attractive, and tangent to the plane
$\eta_{s+1} = \ldots = \eta_{d-1} = 0$ in a sense analogous to Eqs. (5.6.9)-(5.6.11). The tangency
can be measured as in Eq. (5.6.11) in terms of an a priori given constant
$C > 0$.

17 As usual, $\Gamma_s(\delta) = \{x \mid x \in \mathbf{R}^s, |x_i| < \delta, i = 1, \ldots, s\}$.

(ii) Now assume that $s = 2$ and that $\lambda_1(\alpha) = \bar\lambda_2(\alpha)$ is the eigenvalue of $L_\sigma(\alpha)$
with largest modulus for all $\alpha \in J$ and that for $\alpha = \alpha_c \in J$ it is $|\lambda_1(\alpha_c)| = 1$,
$\frac{d}{d\alpha}|\lambda_1(\alpha)|\big|_{\alpha = \alpha_c} > 0$, $\mathrm{Im}\,\lambda_1(\alpha_c) \neq 0$ and $\lambda_1(\alpha_c)^h \neq 1$ for $h = 1, 2, 3, 4, 5$.
Suppose that the vague attractivity of the origin near $\alpha_c$ takes place because a condition
analogous to Eq. (5.5.25), $\bar\gamma_{\alpha_c} < 0$ holds. Finally, assume that $k$ is large
enough and $\alpha - \alpha_c$ is small enough. Then there is a set on $\sigma_\alpha$, which we denote
$T_\alpha$, invariant with respect to the action of $\Phi_{\sigma,\alpha}$ and homeomorphic to a circle
for $\alpha > \alpha_c$. Such a set is the intersection between $\sigma$ and a torus which is
invariant for the solutions of Eq. (5.8.1) and attracts, exponentially fast, all
the motions starting close enough to it.


Observations. 

(1) Hence, in a similar way, as the vaguely attractive stationary points may 
bifurcate, in some circumstances, growing into periodic orbits, the periodic 
orbits may bifurcate growing into two-dimensional tori. 
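
The appearance of an invariant circle for a map can be seen on a toy example. The sketch below is an assumption-laden illustration, not the general construction: it iterates the hypothetical planar map $z \to (1+\alpha)e^{i\omega} z (1 - |z|^2)$, written in complex notation with $\omega = 1$, whose linear part at the origin has eigenvalues of modulus $1+\alpha$. For this particular map the radius obeys $r \to (1+\alpha) r (1 - r^2)$, so the circle of radius $\sqrt{\alpha/(1+\alpha)}$ is invariant for $\alpha > 0$.

```python
import cmath
import math

def step(z, alpha, omega=1.0):
    """One iteration of the model map (rotation, dilation, cubic damping)."""
    lam = (1.0 + alpha) * cmath.exp(1j * omega)
    return lam * z * (1.0 - abs(z) ** 2)

def limiting_radius(alpha, z0=0.05 + 0.0j, n=2000):
    """Modulus of the n-th iterate of z0."""
    z = z0
    for _ in range(n):
        z = step(z, alpha)
    return abs(z)

for alpha in (-0.05, 0.05, 0.10):
    expected = math.sqrt(alpha / (1.0 + alpha)) if alpha > 0 else 0.0
    # For alpha < 0 the origin attracts; for alpha > 0 a circle does.
    print(alpha, limiting_radius(alpha), expected)
```

Since the rotation angle is not a rational multiple of $2\pi$, the iterates keep moving along the circle rather than settling on a periodic point.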

(2) The proof of the above proposition is parallel to that of Propositions 11, 
§5.6 and 12, §5.7, and will not be discussed in detail (see problems at the end 
of this section). 

We only mention that the assumptions on the eigenvalues, at $\alpha = \alpha_c$, are
needed to be able to put the transformation into a normal form analogous to
Eqs. (5.7.4) and (5.7.15), thus allowing us to formulate a vague attractivity
condition like Eq. (5.5.25).

(3) Proposition 14, together with Propositions 7-13 and the problems at the
end of the §5.4-§5.8, provide a quite general theory of the stability of the
vaguely attractive stationary points and periodic orbits and of their
bifurcations, when the regularity class of the differential equation is high enough.

It then becomes natural to ask if it is possible to discuss in a similar fashion
the theory of stability and bifurcations (following the loss of stability as a
parameter $\alpha$ grows) of attractors or of more complex invariant sets.
“Unfortunately”, such a question is very difficult, and it seems unsuited to be
considered in too general a context. Only within classes of special cases can such
a problem be treated in some detail (e.g., in the case of the theory of the
attractors “verifying the axiom A”).$^{18}$ This is a theme of great interest, which
seems to be connected with the theory of many phenomena more general than
the ones of a purely mechanical nature, like the theory of turbulence which
greatly stimulates research on this subject.

(4) As a comment on the generality of the theory of this and the preceding
sections, we must stress that the vague attractivity of a point or of an orbit
near a critical value $\alpha_c$ is an interesting hypothesis, mainly for its elegant
implications, but is far from being realized always (or even often). It often
happens that simple systems of differential equations have stationary points
or periodic orbits which are not vaguely attractive near a critical value $\alpha_c$
where they lose stability. In such cases, there is no general theory guiding the
theoretical analysis of the attractors, and various phenomena are possible, like
the “sudden” (i.e., for $\alpha$ just above $\alpha_c$) transition to an asymptotic regime
governed by attractors of a nature more complex than a stationary point or
a periodic orbit or a two-dimensional torus. Such attractors may be located
far from the attractor that lost stability.

18 For a definition, see [45] and, also, [42] and [7] for detailed discussions of some problems
(References).

In general attractors other than points, periodic orbits or tori run quasi- 
periodically are called “strange”: this qualifies the impossibility of describing 
these attractors as simple objects, rather than qualifying a well-defined math- 
ematical property. 

To illustrate Observations (3), (4) and to get some feeling for how com- 
plicated the pattern of the bifurcations may be even for relatively simple 
differential equations (with quadratic nonlinearities “only” ), we give a series 
of examples. 

Some of the results quoted below may be obtained via the theory of the pre- 
ceding section (like those relative to the stability of the stationary solutions, 
see §5.4 and §5.5 and the associated problems), possibly using a computer to 
estimate the eigenvalues of various stability matrices. However, most of the 
following results can only at present be obtained via the use of numerical 
experiments (usually fascinating). They should not be considered as mathe- 
matical statements but as empirical observations which may reveal themselves 
only as first rough approximations to the phenomena that the same nonlinear 
differential equations may show if studied more carefully. 

We leave to the reader, as interesting practical work, the task of checking 
the following statements analytically (when possible) or numerically (if he has 
access to a computer: for the purpose the software in Appendix U can be used 
in a first approach). 


5.8.1 A. Example 1: The “Lorenz Model”. 
Analytically, this is a system of equations that the reader can interpret as equations of 
motion of a gyroscope subject to suitable forces (following a scheme like the one in §5.1). 


The equations are

$$\dot x = -\sigma x + \sigma y, \qquad \dot y = -\sigma x - y - xz, \qquad \dot z = -bz + xy - \alpha \tag{5.8.6}$$

$\sigma = 10$, $b = \frac{8}{3}$. This system admits a “symmetry” group, i.e. a group of maps transforming
solutions into solutions: namely the two elements group consisting in the maps $x' = \varepsilon x$, $y' =
\varepsilon y$, $z' = z$, $\varepsilon = \pm 1$. The following items describe the structure of the attractors.

(1) For $0 < \alpha < \alpha_1 = b(\sigma + 1)$, there is just one stationary point that can be shown
to be globally attractive for $\alpha$ small enough. It is locally stable and the eigenvalues of the
Lyapunov matrix have a negative real part, $\forall \alpha \in (0, \alpha_1)$, and, numerically, it appears to
be globally attractive all the way up to $\alpha_1$. The stationary point is stationary for all $\alpha$ but
is unstable for $\alpha > \alpha_1$. It is

$$x = y = 0, \qquad z = -\frac{\alpha}{b} \tag{5.8.7}$$

and the symmetry maps leave it invariant.

(2) For $\alpha_1 < \alpha < \alpha_2 = \frac{2b\sigma(\sigma + 1)}{\sigma - b - 1} = \frac{1760}{19} \simeq 92.63$, the preceding point undergoes a
bifurcation, losing stability in one real direction but remaining vaguely attractive and it
bifurcates in two locally stable stationary solutions which are mapped into each other by
the symmetry maps. Such solutions exist for all $\alpha > \alpha_1$, but lose stability for $\alpha > \alpha_2$.
From a numerical point of view, a randomly chosen initial datum is attracted by one of the
above two stationary solutions. The solutions are

$$x = y = \pm\sqrt{\alpha - b(\sigma + 1)}, \qquad z = -(\sigma + 1). \tag{5.8.8}$$

One should not think, however, that the possible asymptotically different motions consist
of the three points of Eqs. (5.8.7) and (5.8.8). For instance, for $\alpha < \alpha_2$ and close to it, there
are some unstable periodic orbits, as can be rigorously shown.$^{19}$

The reason why such asymptotic motions cannot be seen by sampling randomly the initial
data space is that they form a set of zero Lebesgue measure.

(3) For $\alpha > \alpha_2$ the points of Eq. (5.8.8) lose stability. Such loss of stability takes place
in two complex-conjugate directions because two complex-conjugate non real eigenvalues
($\pm i\sqrt{\alpha_2}$) of the stability matrix cross the imaginary axis from left to right.

However, although the fixed points in Eqs. (5.8.8) still exist for all $\alpha > \alpha_1$, they are not
vaguely attractive for $\alpha$ near $\alpha_2$. Hence, one cannot apply the Hopf bifurcation theorem to
infer the existence of a bifurcation into periodic orbits of each of the points of Eq. (5.8.8).
In fact, a strange attractor shows up here, see Fig. 5.8.
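
The two stability thresholds can be checked without computing eigenvalues explicitly. The sketch below assumes that Eq. (5.8.6) reads $\dot x = -\sigma x + \sigma y$, $\dot y = -\sigma x - y - xz$, $\dot z = -bz + xy - \alpha$ with $\sigma = 10$, $b = 8/3$, builds the Jacobian at a stationary point, extracts the coefficients of its characteristic polynomial $\lambda^3 + a_2\lambda^2 + a_1\lambda + a_0$ and applies the Routh-Hurwitz conditions $a_2 > 0$, $a_0 > 0$, $a_2 a_1 > a_0$ (a standard criterion for all roots to have negative real part, used here in place of a numerical eigenvalue routine). All function names are illustrative.

```python
import math

SIGMA, B = 10.0, 8.0 / 3.0

def jacobian(x, y, z):
    """Jacobian of the assumed vector field of Eq. (5.8.6) at (x, y, z)."""
    return [[-SIGMA, SIGMA, 0.0],
            [-SIGMA - z, -1.0, -x],
            [y, x, -B]]

def char_poly(m):
    """(a2, a1, a0) with det(lam*I - m) = lam^3 + a2*lam^2 + a1*lam + a0."""
    tr = m[0][0] + m[1][1] + m[2][2]
    minors = (m[0][0] * m[1][1] - m[0][1] * m[1][0]
              + m[0][0] * m[2][2] - m[0][2] * m[2][0]
              + m[1][1] * m[2][2] - m[1][2] * m[2][1])
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return -tr, minors, -det

def is_linearly_stable(m):
    """Routh-Hurwitz test for a cubic characteristic polynomial."""
    a2, a1, a0 = char_poly(m)
    return a2 > 0 and a0 > 0 and a2 * a1 > a0

def central_point(alpha):
    """Stationary point with x = y = 0."""
    return 0.0, 0.0, -alpha / B

def symmetric_point(alpha, sign=1.0):
    """One of the two stationary points existing for alpha > ALPHA1."""
    x = sign * math.sqrt(alpha - B * (SIGMA + 1.0))
    return x, x, -(SIGMA + 1.0)

ALPHA1 = B * (SIGMA + 1.0)
ALPHA2 = 2.0 * B * SIGMA * (SIGMA + 1.0) / (SIGMA - B - 1.0)

for alpha in (20.0, 40.0):
    print(alpha, is_linearly_stable(jacobian(*central_point(alpha))))
for alpha in (90.0, 95.0):
    print(alpha, is_linearly_stable(jacobian(*symmetric_point(alpha))))
print(ALPHA1, ALPHA2)
```

The stability flags change exactly across the two computed thresholds, which supports the reading of $\alpha_1$ and $\alpha_2$ used above.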


Figure 5.8 Projection on the plane $z = 0$ of the fixed points of Eq. (5.8.8) and of a motion
corresponding to a given initial datum randomly chosen; $\alpha = 200$. The motion is not
periodic. The marks are the projections of the (unstable) fixed points.


It exists up to $\alpha \simeq 230$, disappearing occasionally only for some small intervals of $\alpha$ when
it is replaced by some stable periodic orbits: see Fig. 5.9, 5.10.

19 Applying Problem 16, §5.5, p.408, to either of the Eqs. (5.8.8) near $\alpha_2$, one computes
the vague-attractivity indicator of Proposition 12, Eq. (5.7.5) and shows that it has the
“wrong sign”, $\bar\gamma > 0$, and then one applies Problem 9, §5.7.1, p. 440.

Figure 5.9 $x, y$ projection of a periodic orbit relative to the case $\alpha = 340$. The other
periodic orbits that can be experimentally found turn out to be related to the above by the
transformation $x \to -x$, $y \to -y$, $z \to z$ which is a symmetry of the equation.

Figure 5.10 $y, z$ projection of the orbit in Fig. 5.9.

(4) For large $\alpha$ the strange attractor disappears and is replaced by attractors consisting
of periodic orbits, as it appears from numerical experiments. The existence of some stable
periodic orbits can be proven rigorously for $\alpha$ large (see [41]).


5.8.2 B. Example 2: Navier-Stokes equations on a two-dimensional 
torus with a five mode truncation. 


This is an example in which there are nice Hopf bifurcations. It is, however, more compli- 
cated than Example 1. It could also be interpreted mechanically as a system of two coupled 
rigid bodies with a rather strange looking coupling. However, this mechanical interpretation 
does not seem to be particularly useful, and we do not discuss it. The physical origin of the 
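model has to be searched for in the theory of fluids. The equations are

$$\begin{aligned}
\dot\gamma_1 &= -2\gamma_1 + 4\gamma_2\gamma_3 + 4\gamma_4\gamma_5,\\
\dot\gamma_2 &= -9\gamma_2 + 3\gamma_1\gamma_3,\\
\dot\gamma_3 &= -5\gamma_3 - 7\gamma_1\gamma_2 + \alpha,\\
\dot\gamma_4 &= -5\gamma_4 - \gamma_1\gamma_5,\\
\dot\gamma_5 &= -\gamma_5 - 3\gamma_1\gamma_4.
\end{aligned} \tag{5.8.9}$$

The equations are symmetric under a four elements symmetry group, namely $\gamma_1 \to
\varepsilon\gamma_1$, $\gamma_2 \to \varepsilon\gamma_2$, $\gamma_3 \to \gamma_3$, $\gamma_4 \to \sigma\gamma_4$, $\gamma_5 \to \varepsilon\sigma\gamma_5$ with $\varepsilon, \sigma = \pm 1$.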
(1) For a small, the obvious stationary solution, existing Va > 0, 


Q 


WS ea a Os B= (5.8.10) 


is stable and globally attractive [this could be proved along the lines of the proof of Eq.
(5.2.12) in Proposition 4, §5.2, p.371]. By the Lyapunov criterion, it remains stable up to
$\alpha_c^1 = 5\sqrt{3/2}$. Up to this value it appears, numerically, that it is a global attractor.


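(2) Near $\alpha_c^1$, Eq. (5.8.10) is vaguely attractive and loses stability in one real direction,
generating two stable attractive solutions (5.8.11), mapped into each other by the
symmetries:

$$\gamma_1 = \varepsilon\sqrt{6}\,\sqrt{\frac{\alpha - \alpha_c^1}{7\sqrt{6}}}, \qquad \gamma_2 = \varepsilon\sqrt{\frac{\alpha - \alpha_c^1}{7\sqrt{6}}}, \qquad \gamma_3 = \sqrt{\frac{3}{2}}, \qquad \gamma_4 = \gamma_5 = 0, \qquad \varepsilon = \pm 1. \tag{5.8.11}$$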
Such solutions exist for all $\alpha > \alpha_c^1$ and, numerically, they seem to be globally attractive as
long as they are locally stable: this means that randomly chosen initial data are attracted
by either of them, see the comment to point (2) of Example 1 above.
They lose stability for $\alpha = \alpha_c^2$:

$$\alpha_c^2 = \frac{80}{9}\sqrt{\frac{3}{2}} \tag{5.8.12}$$
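The stability loss takes place in just one real direction again and, again, each of them
bifurcates into two new stable solutions which are locally attractive for $\alpha \in (\alpha_c^2, \alpha_c^3)$, but
persist for all $\alpha > \alpha_c^2$. If $\varepsilon, \sigma = \pm 1$,

$$\gamma_1 = \varepsilon\sqrt{\frac{5}{3}}, \quad \gamma_2 = \varepsilon\,\frac{3\alpha}{80}\sqrt{\frac{5}{3}}, \quad \gamma_3 = \frac{9\alpha}{80}, \quad \gamma_4 = \sigma\sqrt{\frac{9\alpha^2}{6400} - \frac{1}{6}}, \quad \gamma_5 = -\varepsilon\sigma\sqrt{15}\,\sqrt{\frac{9\alpha^2}{6400} - \frac{1}{6}}, \tag{5.8.13}$$

and $\alpha_c^3 = 22.8537\ldots$. The four points are mapped into each other by the symmetry group
elements.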
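
The stationary solutions and thresholds of Example 2 can be cross-checked numerically. The following is only a sketch: it assumes that the five-mode system reads $\dot\gamma_1 = -2\gamma_1 + 4\gamma_2\gamma_3 + 4\gamma_4\gamma_5$, $\dot\gamma_2 = -9\gamma_2 + 3\gamma_1\gamma_3$, $\dot\gamma_3 = -5\gamma_3 - 7\gamma_1\gamma_2 + \alpha$, $\dot\gamma_4 = -5\gamma_4 - \gamma_1\gamma_5$, $\dot\gamma_5 = -\gamma_5 - 3\gamma_1\gamma_4$, and that the fixed points are the ones obtained by solving these equations by hand (trivial branch, first bifurcated pair, second bifurcated quadruple). The function names (`field`, `jacobian`, `fp0`, `fp1`, `fp2`, `max_real_part`) are illustrative, not taken from the text.

```python
import numpy as np

# Thresholds obtained from the linearization of the assumed five-mode system.
ALPHA_C1 = 5.0 * np.sqrt(1.5)            # trivial solution loses stability
ALPHA_C2 = (80.0 / 9.0) * np.sqrt(1.5)   # first bifurcated pair loses stability


def field(g, alpha):
    """Right-hand side of the (assumed) five-mode system."""
    g1, g2, g3, g4, g5 = g
    return np.array([
        -2 * g1 + 4 * g2 * g3 + 4 * g4 * g5,
        -9 * g2 + 3 * g1 * g3,
        -5 * g3 - 7 * g1 * g2 + alpha,
        -5 * g4 - g1 * g5,
        -g5 - 3 * g1 * g4,
    ])


def jacobian(g):
    """Stability matrix (Jacobian of the field) at the point g."""
    g1, g2, g3, g4, g5 = g
    return np.array([
        [-2.0,    4 * g3,  4 * g2, 4 * g5,  4 * g4],
        [3 * g3, -9.0,     3 * g1, 0.0,     0.0],
        [-7 * g2, -7 * g1, -5.0,   0.0,     0.0],
        [-g5,     0.0,     0.0,   -5.0,    -g1],
        [-3 * g4, 0.0,     0.0,   -3 * g1, -1.0],
    ])


def fp0(alpha):
    """Trivial stationary solution, gamma_3 = alpha / 5."""
    return np.array([0.0, 0.0, alpha / 5.0, 0.0, 0.0])


def fp1(alpha, eps=1):
    """Pair of solutions existing for alpha > ALPHA_C1 (gamma_4 = gamma_5 = 0)."""
    g2 = eps * np.sqrt((alpha - ALPHA_C1) / (7.0 * np.sqrt(6.0)))
    return np.array([np.sqrt(6.0) * g2, g2, np.sqrt(1.5), 0.0, 0.0])


def fp2(alpha, eps=1, sigma=1):
    """Four solutions existing for alpha > ALPHA_C2."""
    g1 = eps * np.sqrt(5.0 / 3.0)
    g3 = 9.0 * alpha / 80.0
    g4 = sigma * np.sqrt(g3 ** 2 / 9.0 - 1.0 / 6.0)
    return np.array([g1, g1 * g3 / 3.0, g3, g4, -3.0 * g1 * g4])


def max_real_part(g):
    """Largest real part among the eigenvalues of the stability matrix at g."""
    return np.linalg.eigvals(jacobian(g)).real.max()
```

Scanning `max_real_part(fp2(alpha))` over `alpha` is one way to look for the loss of stability in two complex directions; if the assumptions above are right, the sign change should be located near the value of $\alpha_c^3$ quoted in the text.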

At $\alpha = \alpha_c^3$, Eqs. (5.8.13) lose stability in two complex directions and, apparently, they
remain vaguely attractive. In fact, one can easily find, numerically, that in their vicinity
there is a stable periodic orbit, as if a Hopf bifurcation had taken place (in principle,
one could even check rigorously whether the vague-attractivity indicator $\bar\gamma$ is negative, as it
probably is). The symmetry implies that there will be several periodic orbits, each bifurcating
at $\alpha = \alpha_c^3$ from one of the four fixed points that become unstable. One of them is drawn in
Fig. 5.11.
Figure 5.11 $\gamma_4$-$\gamma_1$ projection of the fixed points and periodic orbits after the bifurcation in
which the points of Eq. (5.8.13) lose stability ($\alpha = 28$). Eq. (5.8.9) has a fourfold symmetry
($\varepsilon, \sigma = \pm 1$) which can be used to generate three other orbits and fixed points symmetric to
the one in the picture by applying the symmetry transformations mentioned in the text.


The structure of the motions for $\alpha > \alpha_c^3$ is quite fascinating. At various values $\alpha_c^{4,1}, \alpha_c^{4,2}, \ldots,$
$\alpha_c^{4,n}, \ldots$ there appear new periodic orbits bifurcating from the preceding ones because the
latter lose stability in one real direction, with the stability matrix of the Poincaré
transformation showing the largest eigenvalue crossing the unit circle through $-1$.
Figure 5.12 $\gamma_1$-$\gamma_4$ projection of one of the (four) orbits which arise by a doubling bifurcation
from one of the orbits of Fig. 5.11 for $\alpha = \alpha_c^{4,1}$ ($\alpha = 28.60$). The other three doubled periodic
orbits are obtained from this one by the symmetry operations.


Such cases, although not contemplated in Proposition 14, can nevertheless be theoretically
treated under suitable vague-attractivity assumptions, and their theory predicts that
the periodic orbit “doubles”, doubling also its period,20 see also Problems 10-13 for §5.8.


20 This can easily be understood intuitively by arguing as in the Observation (6) to
Proposition 12, p.431. Write the Poincaré map as $\Phi_\alpha(x) = (-1 - (\alpha - \alpha_c))x + \varphi(x, \alpha)$,
assuming that $x\,\varphi(x, \alpha)/x^4 \to \bar\gamma < 0$. One easily finds that there are two points
$x_{+,\alpha}, x_{-,\alpha} \simeq \pm\sqrt{(\alpha - \alpha_c)/|\bar\gamma|}$ mapped into each other by $\Phi_\alpha$. This means that the
orbit “doubles”.
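
The doubling mechanism can be made concrete on a model map. The sketch below uses the normalization of Problem 11 for §5.8, $x' = -(1+\alpha)x + g\,x^3$ with a constant cubic coefficient $g > 0$ (the constant `g` and the function names are assumptions made for illustration). For this odd map the condition $\Phi_\alpha(x) = -x$ gives $g x^2 = \alpha$, so the two points of the cycle are $\pm\sqrt{\alpha/g}$, and the derivative of $\Phi_\alpha^2$ on the cycle is $(1 - 2\alpha)^2 < 1$ for small $\alpha > 0$.

```python
import math


def phi(x, alpha, g=1.0):
    """Model map x' = -(1 + alpha) x + g x^3, with constant g > 0 (illustrative)."""
    return -(1.0 + alpha) * x + g * x ** 3


def period_two_point(alpha, g=1.0):
    """Positive point of the 2-cycle: phi(x) = -x is equivalent to g x^2 = alpha."""
    return math.sqrt(alpha / g)


def cycle_multiplier(alpha, g=1.0):
    """Derivative of phi∘phi on the cycle, i.e. phi'(x+) * phi'(x-)."""
    x = period_two_point(alpha, g)
    d = -(1.0 + alpha) + 3.0 * g * x * x  # phi' takes the same value at x+ and x-
    return d * d


def iterate(x, alpha, steps, g=1.0):
    """Apply phi `steps` times starting from x."""
    for _ in range(steps):
        x = phi(x, alpha, g)
    return x
```

For $\alpha > 0$ the origin is unstable (its multiplier is $-(1+\alpha)$), and a small initial datum is attracted by the cycle, alternating between the neighborhoods of the two points.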
Figure 5.13 Further doubling of the orbit of Fig. 5.12; $\alpha = 28.650$.


The sequence of such bifurcations seems to be infinite and has been observed un- 
til the period has reached approximately 2° times the initial value. The accumulation 
point $\lim_{n\to\infty} \alpha_c^{4,n}$, as experimentally measured by a computer, seems to be $\alpha_c^{4,\infty} = 28.6681\ldots$.
Figure 5.14 Further doubling; $\alpha = 28.666$.

For $\alpha = \alpha_c^{5,0} = 28.663\ldots$, there appears a new fourfold family of periodic orbits
(symmetric to each other under the symmetry group) that in the narrow interval
$\alpha \in [\alpha_c^{5,0}, \alpha_c^{4,\infty}]$ coexists with the preceding ones, although they are also stable. A randomly
chosen initial datum is attracted by one of the stable orbits of the two families, i.e. by one
of the eight stable periodic orbits.

One of the new orbits (quite different in structure and location in phase space) is drawn 
in Fig. 5.15. 
Figure 5.15 One of the four new orbits of the family that is born at $\alpha = \alpha_c^{5,0} = 28.663$,
for $\alpha = 28.663$. The other three orbits are obtained from this by transforming it with the
symmetries of the equation.


As $\alpha$ grows beyond $\alpha_c^{5,0}$ these new orbits also undergo the same fate, doubling after
losing stability into a double orbit at $\alpha = \alpha_c^{5,1}$ which, in turn, doubles into a double orbit
at $\alpha_c^{5,2}$, etc. “indefinitely”, with an accumulation point at $\alpha = \alpha_c^{5,\infty} = 28.7201\ldots$. An
example of the bifurcation is drawn in Fig. 5.16.
Figure 5.16 Doubling of the orbit in Fig. 5.15 for $\alpha > \alpha_c^{5,1}$; $\alpha = 28.710$.


For $\alpha > \alpha_c^{5,\infty}$, it seems that the motion is asymptotically described by a strange
attractor up to $\alpha \simeq 34$, with the exception of at least one very small interval of values of $\alpha$,
where the asymptotic behavior is again ruled by some periodic orbits which, as $\alpha$
grows, lose stability “again through $-1$”, doubling in period infinitely many times. See the
$\gamma_3, \gamma_1$ projection of the attractor for $\alpha = 31$ in Fig. 5.17.
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Figure 5.17 Projection of an orbit with an asymptotic motion governed, apparently, by a 
strange attractor; a = 31. 


Beyond $\alpha \simeq 34$, the motion seems to be governed by periodic and globally attractive orbits
whose period and shape vary regularly with $\alpha$ (as before, here global “numerical”
attractivity means that if the initial datum is randomly chosen, it converges to one of the above
periodic motions). An example is drawn in Fig. 5.18.
Figure 5.18 $\alpha = 34$; an attractive periodic orbit, $\gamma_1, \gamma_3$ projection.


We stress that the adjective “numerical”, referred to some properties of the solutions, 
means that such properties come out of a computer-assisted study and that they are not 
mathematically rigorous. 

Another exceptionally interesting and marvelous property of the above sequences of 
bifurcations is that, numerically, the sequences 


$$\frac{\alpha_c^{4,n} - \alpha_c^{4,n-1}}{\alpha_c^{4,n+1} - \alpha_c^{4,n}}, \qquad \frac{\alpha_c^{5,n} - \alpha_c^{5,n-1}}{\alpha_c^{5,n+1} - \alpha_c^{5,n}}$$

seem to converge to a limit $\delta$ which is $\delta \simeq 4.67$. This is a numerical value which is
conjectured, “Feigenbaum conjecture”, to be “universal”, i.e., independent of the particular
differential equations giving rise to stable periodic orbits which successively grow out of
doubling bifurcations when one of them, stable at a given value of $\alpha$, loses stability as $\alpha$
grows, giving rise to a stable doubled orbit, [15].
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However, it is an open problem to formalize in satisfactory generality and to give man- 
ageable sufficient conditions for a proof of the validity of this fascinating conjecture which 
seems to be verified in several cases studied numerically (and different from the above- 
considered ones). Recently, considerable progress in this direction has been achieved (see 
[9], and [8], and [30]). 
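
The convergence of such ratios is easy to observe on the simplest map showing doubling bifurcations, $x' = 4\alpha x(1-x)$ (the map of Problem 14 below). The sketch that follows does not locate the bifurcation values themselves: as a commonly used substitute it locates the “superstable” parameters, at which the critical point $x = 1/2$ belongs to the orbit of period $2^n$, and forms the ratios of successive differences. The function names and the scan-then-bisect strategy are choices made here for illustration.

```python
import math


def g(alpha, n):
    """f^(2^n)(1/2) - 1/2 for f(x) = 4*alpha*x*(1 - x)."""
    x = 0.5
    for _ in range(2 ** n):
        x = 4.0 * alpha * x * (1.0 - x)
    return x - 0.5


def next_superstable(a_older, a_old, n, scan=50):
    """Find the first zero of g(., n) to the right of a_old.

    The scan step is a fraction of the previous spacing; once the sign of g
    changes, the zero is refined by bisection.
    """
    step = (a_old - a_older) / scan
    lo = a_old + step
    sign = g(lo, n)
    hi = lo + step
    for _ in range(10 * scan):
        if g(hi, n) * sign <= 0.0:
            break
        lo, hi = hi, hi + step
    else:
        raise RuntimeError("no sign change found while scanning")
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if g(mid, n) * sign > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def superstable_values(n_max):
    """Superstable parameters for periods 1, 2, 4, ..., 2^n_max."""
    # Periods 1 and 2 are known in closed form: alpha = 1/2 and (1 + sqrt 5)/4.
    values = [0.5, (1.0 + math.sqrt(5.0)) / 4.0]
    for n in range(2, n_max + 1):
        values.append(next_superstable(values[-2], values[-1], n))
    return values


def doubling_ratios(values):
    """Ratios (a_k - a_{k-1}) / (a_{k+1} - a_k) of successive spacings."""
    return [(values[k] - values[k - 1]) / (values[k + 1] - values[k])
            for k in range(1, len(values) - 1)]
```

Calling `doubling_ratios(superstable_values(6))` gives a short list of ratios whose last entries settle close to the value $4.67$ mentioned above.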

The structure of the just discussed bifurcations is illustrated by Figs. 5.11-5.18,
representing projections on several planes of trajectories of Eq. (5.8.9).


5.8.3 C. Example 3: Navier-Stokes equations on a two-dimensional 
torus with seven modes. 


A system exhibiting periodic orbits bifurcating into two-dimensional tori along the scheme 
suggested by Proposition 14 is the following: 


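$$
\begin{aligned}
\dot\gamma_1 &= -2\gamma_1 + 4\sqrt{5}\,\gamma_2\gamma_3 + 4\sqrt{5}\,\gamma_4\gamma_5,\\
\dot\gamma_2 &= -9\gamma_2 + 3\sqrt{5}\,\gamma_1\gamma_3,\\
\dot\gamma_3 &= -5\gamma_3 - 7\sqrt{5}\,\gamma_1\gamma_2 + 9\gamma_1\gamma_7 + \alpha,\\
\dot\gamma_4 &= -5\gamma_4 - \sqrt{5}\,\gamma_1\gamma_5,\\
\dot\gamma_5 &= -\gamma_5 - 3\sqrt{5}\,\gamma_1\gamma_4 - 5\gamma_1\gamma_6,\\
\dot\gamma_6 &= -\gamma_6 + 5\gamma_1\gamma_5,\\
\dot\gamma_7 &= -5\gamma_7 - 9\gamma_1\gamma_3,
\end{aligned}
\tag{5.8.14}
$$

which can be discussed in a similar way as that of Example 2.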
The structure of the bifurcations and attractors is considerably more complicated and 
interesting. We do not discuss it in detail, feeling that Figs. 5.19-5.23 will, by themselves, 
excite the reader’s curiosity and will stimulate him to read some original papers on the 


profound theory of Feigenbaum, [15], and on Example 3 as well as on Examples 1 and 2 
(see [15], [18], [19], [17],[47]). 
Figure 5.20 $\alpha = 71.60$; the preceding orbit has originated a stable torus (two dimensional)
run quasi-periodically by the motions of Eq. (5.8.14), one of which is shown here.

Figure 5.21 $\alpha = 190$; another stable periodic orbit.

Figure 5.22 $\alpha = 190$; another stable periodic orbit which coexists with that of Fig. 5.21. A
randomly chosen initial datum, at this value of $\alpha$, is attracted either by the periodic motions
of Figs. 5.21 and 5.22 [or some of their images by the symmetries of Eq. (5.8.14)] or by the
quasi-periodic motion which takes place on the torus of Fig. 5.23.

Figure 5.23 $\alpha = 195$: a stable two-dimensional torus run quasi-periodically by the motions of
Eq. (5.8.14). This torus is an attractor apparently bifurcating from one of the periodic orbits
in Fig. 5.21. Tori of dimension 2 can be quite easily identified by plotting a 2-dimensional
section and checking that it can be fitted by a smooth closed curve: this can be done for
instance for the torus in this figure.


All the equations of the above examples, as noted in Examples 1 and 2, can 
be interpreted as equations governing some strange systems of coupled rigid 
bodies, but they have been considered in the literature as equations approx- 
imating the differential equations describing the motion of simple fluids (like 
the “Euler” or the “Navier-Stokes” equations or the “thermo-fluidodynamics” 
equations). Their connection with the mechanics of rigid bodies is not surpris- 
ing, however, if one notes that the classical fluid equations (Euler or Navier- 
Stokes equations) can be considered as equations describing infinitely many 
coupled rigid bodies (with very strange and, perhaps, mechanically unnatural 
coupling); this remark becomes clearer if one recalls that the equations of mo- 
tion of the fluid bodies are usually derived by thinking of them as consisting of 
many small rigid bodies and applying to each of them the cardinal equations 
of mechanics. 


We shall not further pursue the discussion of the models of dissipative 
systems and of their stability theory. This is a subject under current intense 
investigations, and the contents of §5.1-§5.8 provide some introduction to the 
literature. 


5.8.4 Problems and Complements 


1. Let $\Phi \in C^\infty(\mathbb{R}^d)$ be a map of $\mathbb{R}^d$ into itself with the origin as a fixed point. Write
$x' = \Phi(x)$ as

$$x' = Lx + F(x),$$

where $L$ is a $d \times d$ matrix and $F$ has a second-order zero at the origin. Suppose that the
eigenvalues of $L$ are pairwise distinct. Show that there is a linear change of coordinates that
allows us to put the above map into the form

$$
\begin{aligned}
x_1^{(j)\prime} &= (\mathrm{Re}\,\lambda_j)\,x_1^{(j)} - (\mathrm{Im}\,\lambda_j)\,x_2^{(j)} + F_1^{(j)}(x),\\
x_2^{(j)\prime} &= (\mathrm{Im}\,\lambda_j)\,x_1^{(j)} + (\mathrm{Re}\,\lambda_j)\,x_2^{(j)} + F_2^{(j)}(x),\\
x^{(h)\prime} &= \lambda_h\,x^{(h)} + F^{(h)}(x),
\end{aligned}
$$

with $j = 1, \ldots, s$, $h = 2s+1, \ldots, d$, where $\lambda_1, \ldots, \lambda_s$ are the $s$ complex non real eigenvalues
of $L$ (each counted together with its complex conjugate) and $\lambda_{2s+1}, \ldots, \lambda_d$ are the $(d - 2s)$ real
eigenvalues of $L$; $F_1^{(j)}, F_2^{(j)}, \ldots, F^{(h)}$ have a second-order zero at the origin 0. (Hint: Proceed as in
the proof of Proposition 7, p.393, §5.5.)


2. In the context of Problem 1, suppose that $d = 2$, $\lambda = \lambda_1$ = {complex non real}. Let
$z = x_1^{(1)} + i\,x_2^{(1)}$. Show that the map can be written as a map of $\mathbb{C}$ into itself:

$$z' = \lambda z + F(z, \bar z),$$

where $F$ has a second-order zero at $z = 0$.


3. Show that if $\lambda^3 \ne 1$, $\lambda \ne 0$, the map in Problem 2 can be written in a new coordinate
system as

$$\zeta' = \lambda\zeta + N(\zeta, \bar\zeta),$$

where $N$ has a third-order zero at the origin $\zeta = 0$. (Hint: Proceed as in the proof of
Proposition 8, p.396, §5.5, i.e., write $F(z, \bar z) = a_2 z^2 + a_1 z\bar z + a_0 \bar z^2 + N(z, \bar z)$ with $N$ having
a third-order zero at $z = 0$. Change variables near $z = 0$ as $\zeta = z + A_2 z^2 + A_1 z\bar z + A_0 \bar z^2$
and choose the $A$'s in order to eliminate the second-order terms from the map in the new
coordinates.)


4. Show that if $\lambda^4 \ne 1$, $\lambda \ne 0$, the map in Problem 3, of $\mathbb{C}$ into itself,

$$\zeta' = \lambda\zeta + N(\zeta, \bar\zeta),$$

with $N$ having a third-order zero at $\zeta = 0$, can be put into the form

$$z' = \lambda z + b\,z|z|^2 + Q(z, \bar z)$$

with $Q$ having a fourth-order zero at $z = 0$, using a change of variables (near the origin) of
the form: $z = \zeta + A_3\zeta^3 + A_2\zeta^2\bar\zeta + A_1\zeta\bar\zeta^2 + A_0\bar\zeta^3$.


5. In the context of Problem 4, show that the map can also be written as

$$z' = \lambda z\, e^{\,b|z|^2 + Q(\varrho, \theta)}$$

near $z = 0$, where $z = \varrho e^{i\theta}$ and $Q$ is a $C^\infty$ function of $(\varrho, \theta) \in [0, \bar\varrho) \times T^1$ with a third-order
zero at the origin of the $\varrho$ variable.


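6. Consider the map defined as follows: let $z = \varrho e^{i\theta}$ and

$$z' = \lambda(\alpha)\, z\, e^{\,b(\alpha)|z|^2 + Q(\varrho, \theta, \alpha)} \equiv \Phi_\alpha(z)$$

where $Q \in C^\infty([0, \bar\varrho) \times T^1 \times (-a, a))$, $\lambda, b \in C^\infty((-a, a))$, $|\lambda(0)| = 1$, and $Q$ has a third-order
zero at $\varrho = 0$, for all $\theta \in T^1$, for all $\alpha \in (-a, a)$. Show that the origin is vaguely attractive
near zero if $\mathrm{Re}\, b(0) < 0$. (Hint: If $\mathrm{Re}\, b(0) < 0$ the origin is attractive for $\alpha = 0$. ...)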
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7.* Let (a) = e®t*(@) and, in the context of Problem 6, let Reb(0) < 0. Show that 
the maps a have an attractive invariant set of approximate equation |z| = [Reva 
for a > 0 small. (Hint: Proceed as in the analysis of the Hopf theorem, performing the 
analogous steps and estimates.) Actually (but this is more difficult than the above problem), 
the invariant set is a curve homeomorphic to a circle. The proof of this could be achieved 
by writing the equation of the unknown curve as 


20) = J aoe ETEO” 


and trying to determine ¢(@) by writing the condition that the above curve is a invariant, 
i.e., 


EE + e(0))? + (VE QU + el), 8,0), 


1+ e(0") =(1 + e(0))e T FO)" +. (ya), (1 + el0), 0,0), 


where Q, Qı are smooth functions of their three arguments. The equation can be solved 
recursively. The proof, however, is not really straightforward (see [29]). 


8. Prove the first part of Proposition 14 for d = 2,s = 1. (Hint: Proceed as in the proof 
of Proposition 11, §5.6, p.411. Here the transformation in Problem 6 plays the role played 
there by the equation in normal form.) 


9. Prove the second part of Proposition 14 for d = 2, assuming that the invariant set of 
Problem 7 is actually homeomorphic to a circle and making use of Problems 2-7 for the 
reduction to normal form. 


10. Consider the $C^\infty$ map $\Phi$ of $\mathbb{R}^1$ into itself:

$$x' = \Phi(x) = \lambda x + g(x)$$

with $g \in C^\infty(\mathbb{R})$ having a second-order zero at the origin. Show that if $\lambda \ne 0, 1$, there is a
change of variables transforming the above map into a new one having the form

$$\xi' = \lambda\xi + \xi^3\gamma(\xi)$$

with $\gamma$ in $C^\infty(\mathbb{R})$, for $\xi$ near 0. (Hint: Let $g(x) = G x^2 + \bar g(x)$ with $\bar g$ having a third-order
zero at the origin. Set $\xi = x + \Gamma x^2$ and find a suitable $\Gamma$.)


11. In the context of Problem 10, show that if

$$x' = -(1 + \alpha)x + x^3\gamma(x, \alpha) \equiv \Phi_\alpha(x)$$

is a family of maps of class $C^\infty$ parameterized by $\alpha$ with $\gamma \in C^\infty(\mathbb{R}^2)$, $\gamma(0, 0) > 0$, then
there exist two points $x_+(\alpha)$, $x_-(\alpha)$, for $\alpha > 0$ small, such that

$$\Phi_\alpha(x_+(\alpha)) = x_-(\alpha), \qquad \Phi_\alpha(x_-(\alpha)) = x_+(\alpha),$$

i.e., constituting a period 2 orbit (“doubling bifurcation”). Furthermore, show that by the
Lyapunov criterion, such an orbit is stable and attractive. (Hint: Use the implicit function
theorem to find $x_+(\alpha)$, say, as a root of $\Phi_\alpha^2(x) = x$. Prove the stability by applying the
criterion of Lyapunov, Proposition 13, §5.8, p.441, to $x_+(\alpha)$ and to the map $\Phi_\alpha^2$.)


12. Consider a map $x' = \Phi(x, \alpha)$ of $\mathbb{R}^d$ into itself, parameterized by $\alpha \in \mathbb{R}$. Let $\Phi \in
C^\infty(\mathbb{R}^d \times \mathbb{R})$, let the origin be a fixed point of the map, for all $\alpha$ near zero, and let $L(\alpha)$ be
its stability matrix. Suppose that for $\alpha \in (-a, a)$, all the eigenvalues of $L(\alpha)$ are pairwise
distinct and such that $|\lambda_1(0)| = 1 > \nu > |\lambda_2(0)|, \ldots, |\lambda_d(0)|$, with $\nu < 1$.

Using the attractive manifold theorem described in the first part of Proposition 14, p.442,
and Problem 11, show that if $\lambda_1(\alpha) = -1 - \alpha$ then the origin undergoes a “period doubling
bifurcation” as $\alpha$ grows through zero (in the sense of Problem 11). (Hint: Use the attractive
manifold theorem to reduce the problem to a one dimensional problem and then apply
Problems 10 and 11.)


13. Prove that Problem 12 implies that if $\Phi(x, \alpha)$ is (an arbitrary extension of) the Poincaré
map for a periodic orbit of a one-parameter family of differential equations in $\mathbb{R}^{d+1}$, then
the periodic orbit bifurcates to a stable (exponentially attractive) periodic orbit, as $\alpha$ grows
through 0, with roughly a double period.


14. Study the map $x' = 4\alpha x(1 - x)$, $x \in \mathbb{R}$, and show that $[0, 1]$ is an invariant set if
$\alpha \in [0, 1]$. Find the first bifurcation of the fixed points $x = 0$ and $x = x_\alpha > 0$, $x_\alpha = 1 - \frac{1}{4\alpha}$
(consider the latter only for $\alpha > \frac14$). Show that in some sense $x_\alpha$ grows out of a bifurcation
of $x = 0$; while when $x_\alpha$ loses stability, it undergoes a doubling bifurcation in the sense of
Problem 11.


15. Consider the map $\Phi$ in Problem 14 for $\alpha = 1$, restricted to $[0, 1]$. Show that the change
of variables $y = \frac{2}{\pi}\arcsin\sqrt{x}$ transforms this map into the map $\Psi$:

$$\Psi(y) = \begin{cases} 2y & \text{if } 0 \le y \le \frac12,\\ 2(1 - y) & \text{if } \frac12 \le y \le 1. \end{cases}$$

Draw (roughly) the graph of $\Psi^n$ and show that $\Psi^n$ has (by inspection of the graph) $2^n$ fixed
points which correspond to $2^n$ periodic points for $\Psi$. Deduce that $\Phi$ also has $2^n$ periodic
points of period $n$ (here the period is not necessarily minimal).


16. Using Problem 15, show that $\Psi$ and $\Phi$ have a dense set of periodic points. (Hint: Look
at the graph of $\Psi^n$.)


17. Study the stability of the fixed points of the map of $\mathbb{R}^2 \to \mathbb{R}^2$ parameterized by $\alpha, b$
(“Hénon's map”):

$$H(x, y) = (y - \alpha x^2 + 1,\; b x)$$

with $b$ real and find whether one of its fixed points undergoes, for some fixed value of $b$, a
doubling bifurcation as $\alpha$ grows using Problems 11 and 12.


18. Let $x \to \Phi(x)$ be a $C^\infty$ map of the plane into itself which is invertible and area
preserving (i.e., area $E$ = area $\Phi^{-1}(E)$ for all measurable sets $E$). Which relation between
the eigenvalues of the Lyapunov stability matrix of a fixed point follows as a consequence
of the conservation of the area?


19. Same as Problem 18 for a volume-preserving map of $\mathbb{R}^d$ into itself.


20. In the context of Proposition 13, p.441, show that the eigenvalues of the stability matrix
of a periodic orbit depend neither on the particular system of coordinates introduced on
$\sigma$ nor on the point $\xi_0$ chosen on the orbit. They are “characteristic numbers” of the orbit
itself. (Hint: This is a problem analogous to Problem 15, p.388, §5.4. The first statement
is proven in exactly the same way. To prove the second, use the trajectories to “transfer”
a system of coordinates on $\sigma$ (through $\xi_0$) into a system of coordinates on $\sigma'$ (through $\xi_0'$,
etc.).)
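
As a numerical companion to Problems 15 and 16, the sketch below checks that the change of variables $y = \frac{2}{\pi}\arcsin\sqrt{x}$ conjugates $x' = 4x(1-x)$ to the piecewise linear map $\Psi$, and lists the fixed points of $\Psi^n$ exactly. It relies on the observation (an assumption of this sketch, to be justified as part of the problem) that on each interval $[k/2^n, (k+1)/2^n]$ the map $\Psi^n$ is affine, equal to $2^n y - k$ for even $k$ and to $k + 1 - 2^n y$ for odd $k$. Function names are illustrative.

```python
import math
from fractions import Fraction


def logistic(x):
    """The map of Problem 14 with alpha = 1."""
    return 4.0 * x * (1.0 - x)


def tent(y):
    """Piecewise linear map: 2y on [0, 1/2], 2(1 - y) on [1/2, 1]."""
    return 2 * y if 2 * y <= 1 else 2 * (1 - y)


def to_y(x):
    """Change of variables y = (2/pi) arcsin(sqrt(x))."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


def iterate(f, y, n):
    """Apply f to y, n times."""
    for _ in range(n):
        y = f(y)
    return y


def tent_fixed_points(n):
    """Exact fixed points of tent^n, one per interval [k/2^n, (k+1)/2^n]."""
    m = 2 ** n
    points = set()
    for k in range(m):
        if k % 2 == 0:
            points.add(Fraction(k, m - 1))      # solves 2^n y - k = y
        else:
            points.add(Fraction(k + 1, m + 1))  # solves k + 1 - 2^n y = y
    return points
```

Since the fixed points of $\Psi^n$ have spacing of order $2^{-n}$, the union over $n$ of the sets returned by `tent_fixed_points(n)` is dense in $[0, 1]$, which is the content of Problem 16 for $\Psi$.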
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5.9 Stability in Conservative Systems: Introduction 


...desinas ineptire 
et quod perisse vides perditum ducas 


Stability of Hamiltonian motions is a natural problem arising, perhaps for the 
first time, in the theory of the solar system, where it is still unsolved. 

In nature there are many interesting systems which are “quasi-integrable” 
in the sense that their equations of motion differ, up to “quasi negligible” 
terms, from equations of motion of an integrable system. 

A nice example is provided by the solar system which we consider via a 
model in which the solar mass $M$ is $+\infty$, i.e., the Sun is a fixed point mass
attracting the planets with a central force with potential energy inversely 
proportional to a planet distance and directly proportional to its mass. In 
the approximation in which the reciprocal attraction among the planets is 
neglected, it is clear that the solar system is described by as many Hamiltonian 
integrable systems as the number of planets (i.e., nine), one for each planet. 
In Chapter 4, §4.9.1 and §4.10.1, we saw that such Hamiltonian systems are 
integrable in the sense of Definitions 10 and 11, §4.8.1. 

It is then attractive to think that the actual motion of the solar system is 
“close” to this idealized motion followed by nine independent planets. 

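Keeping, for simplicity, the approximation that the Sun is a point mass
fixed with respect to the fixed stars, we must compare the solutions of the
following two systems of equations: $i = 1, \ldots, 9$,

$$m_i\,\ddot{x}^{(i)} = -\frac{K m_i}{|x^{(i)}|^2}\,\frac{x^{(i)}}{|x^{(i)}|} \tag{5.9.1}$$

$$m_i\,\ddot{x}^{(i)} = -\frac{K m_i}{|x^{(i)}|^2}\,\frac{x^{(i)}}{|x^{(i)}|} - \varepsilon\sum_{j \ne i}\frac{m_i m_j}{|x^{(i)} - x^{(j)}|^2}\,\frac{x^{(i)} - x^{(j)}}{|x^{(i)} - x^{(j)}|} \tag{5.9.2}$$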


at least for initial data which, put into Eq. (5.9.1), give rise to trajectories
on which $|x^{(i)} - x^{(j)}|$, $i \ne j$, remains so large as to make the second term in
the right-hand side of Eq. (5.9.2) small compared to the first. The constant
$\varepsilon$ is the universal gravitation constant, $m_1, \ldots, m_9$ are the masses of the nine
main planets, $K = \varepsilon M_S$, where $M_S$ is the Sun's real mass; satellites, comets,
asteroids, rings, etc. have been disregarded.

Choosing as time origin the present instant, the situation in which the
solar system is initially found is, as is well known, such that the term in $\varepsilon$ in
Eq. (5.9.2) has a modulus quite a bit smaller than the term representing the
Sun's attraction. The first question, preliminary to the comparison between
the solutions of Eqs. (5.9.1) and (5.9.2), is whether this situation remains
unchanged as time goes by.

This property can be easily verified through the explicit solution of the 
various Kepler problems in the case of Eq. (5.9.1). Hence, this question is 
intimately related to the comparison between Eqs. (5.9.1) and (5.9.2). 
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From the general results of the theory of ordinary differential equations, it 
is evident that “close equations yield close solutions” ; however, this closeness is 
not uniform over time. It does not, indeed, follow from the regularity theorems 
and the initial data and parameters dependence that close equations with 
close initial data produce solutions which stay close forever or solutions whose 
trajectories, as sets, remain close. The first possibility is almost always false. 
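
A minimal illustration of this lack of uniformity in time, on a system much simpler than the planetary one: compare $\ddot x = -x$ with $\ddot x = -(1+\varepsilon)^2 x$, both started from $x = 1$, $\dot x = 0$. The choice of the harmonic oscillator and the function names are assumptions made here only for the example. The two solutions are close for times of order 1, are at distance about 2 at time $t = \pi/\varepsilon$, and yet the two trajectories, as sets in phase space, stay within $\varepsilon$ of each other.

```python
import math


def state(t, omega):
    """Phase-space point (x, v) at time t for x'' = -omega^2 x, x(0) = 1, v(0) = 0."""
    return (math.cos(omega * t), -omega * math.sin(omega * t))


def gap(t, eps):
    """Distance at time t between the unperturbed and the perturbed solution."""
    x0, v0 = state(t, 1.0)
    x1, v1 = state(t, 1.0 + eps)
    return math.hypot(x1 - x0, v1 - v0)


def distance_from_unit_circle(t, eps):
    """Distance of the perturbed point from the unperturbed trajectory (unit circle)."""
    x1, v1 = state(t, 1.0 + eps)
    return abs(math.hypot(x1, v1) - 1.0)
```

With `eps = 0.01`, `gap(1.0, eps)` is of order `eps`, while `gap(math.pi / eps, eps)` is close to 2; `distance_from_unit_circle` never exceeds `eps`.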

One then asks if the corrections to the equations of motion (5.9.1) due to
the presence of the term in $\varepsilon$ in Eq. (5.9.2), though small, may lead to changes
in the motions which, in the long run, result in a motion very different from
the one foreseen in Eq. (5.9.1).

A priori, one could even consider “unthinkable” or undesirable catastrophic 
events, like interplanetary collisions or capture of a planet by the burning Sun. 

Of course, one wishes to have analytic instruments for the solutions of Eq. 
(5.9.2) and of comparison with those of Eq. (5.9.1). The analysis should allow 
not only the exclusion of such catastrophes, but even to show that it is true, 
or essentially true, that the planets movements are described by Eq. (5.9.1). 
And furthermore that, if needed, one can compute or estimate the deviations 
between the motions of Eq. (5.9.1) and those of Eq. (5.9.2) with equal initial 
data at least for long times, i.e., of astronomical magnitude, long compared 
with the revolution periods of the various planets. 

In other words, one wishes to use Eq. (5.9.1) for “rough” astronomical 
predictions and to have algorithms to compute the corrections at least for 
times of the order of magnitude of several thousand years. 

That this is a delicate problem can be deduced from the fact that rough 
estimates, too pessimistic, of the errors lead to the conclusion that the re- 
ciprocal influence between the planets may become important within a few 
years. 

For instance, the time necessary for a collision between two heavenly bodies
of the size of Venus and Earth, assuming that at time zero they are standing
still (relative to the fixed stars) at a distance $d(T,V)$, equal to the actual
Earth-Venus maximal observed distance, could be estimated not longer than
$T_{col}$ such that (accelerated motion estimate)

$$\frac{\varepsilon\,(m_T + m_V)}{2\,d(T,V)^2}\;T_{col}^2 = d(T,V) \;\Longrightarrow\; T_{col} \simeq 370 \text{ years}. \tag{5.9.3}$$

Hence, we see that even to establish some accurate predictions for times
of a few centuries, a remarkable precision is needed, i.e., it is necessary to
take into account the fact that the planets' motion at the initial time is very
far from a situation bound to a collision and that, obviously, the corrections
to the motion described by Eq. (5.9.1), originated by the additional terms
in $\varepsilon$ in Eq. (5.9.2), are not always favorable to collisions (or escapes, etc.).
Think of the two-body problem where the systematic attraction results only
in providing a curvature to the trajectory. On the average, the effects favorable
to catastrophic events may be much smaller or even totally absent with respect
to the above pessimistic calculation.
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This and similar problems, which may obviously be formulated for systems 
very different from the solar system (like harmonic oscillators with conserva- 
tive anharmonic additional perturbing forces or, more generally, for systems 
“close” to integrable systems), are typical stability problems for conservative 
systems. 

To the above problems, one adds analogous problems of stability of in- 
tegrable systems perturbed by the addition, among the active forces, of ex- 
ternal forces varying with simple time laws (“non autonomous Hamiltonian 
systems” ). 

All of the above problems are much more difficult than one might imagine, 
perhaps naively. Only recently have some techniques apt to provide some 
answers been developed (and are being developed), although we are still quite 
far from a “satisfactory” theory even for very small perturbations. 

The main result on this theme is the following theorem (“Kolmogorov- 
Arnold-Moser theorem” ) which we shall analyze in some particularly interest- 
ing cases in the §5.12. The reader who wishes to obtain deeper insights can 
consult [33, 34]. 


15 Proposition. Consider a mechanical system in $\mathbb{R}^d$ with $\ell$ degrees of
freedom, subject to conservative forces with potential energy $\Phi_0 \in C^\infty(\mathbb{R}^d)$
bounded from below and subject to ideal constraints.

Suppose that the system is canonically integrable on some open set $W$ of the
phase space (see Definition 11, p.289, §4.8) and call $H_0$ its Hamiltonian.
If $I : W \to V \times T^\ell$ is the integrating transformation and if we set $(A, \varphi) =
I(p, q)$, the motion in $(A, \varphi)$ coordinates is, by definition,

$$\tilde S_t(A, \varphi) = I(S_t(I^{-1}(A, \varphi))) = (A, \varphi + \omega(A)t), \tag{5.9.4}$$

where $\omega = (\omega_1(A), \ldots, \omega_\ell(A))$ are $\ell$ pulsations corresponding to the $\ell$ prime
integrals $A = (A_1, \ldots, A_\ell)$, and $\omega(A) = \partial_A h_0(A)$ if $h_0(A) = H_0(I^{-1}(A, \varphi))$
[$\varphi$ independent because of the integrating character of $I$, see Observation (1),
p.289]. Assume $V$ to be bounded, and that the matrix

$$J_{ij} = \frac{\partial \omega_i}{\partial A_j} \tag{5.9.5}$$

has non vanishing determinant on all of $V$ (“non isochrony” of the system).
Then, if $\Psi \in C^\infty(\mathbb{R}^d)$ is a uniformly bounded potential energy, the mechanical
system with the same constraints but with an active force with potential energy

$$\Phi_0 + \varepsilon\Psi \tag{5.9.6}$$

has various remarkable properties which will be described calling $S_t^{(\varepsilon)}$ and
$\tilde S_t^{(\varepsilon)}$, $t \in \mathbb{R}$, the transformations generating the motions, corresponding to
Eq. (5.9.6) and to the given constraints, in the coordinates $(p, q)$ and $(A, \varphi)$,
respectively. If $(A, \varphi) = I(p, q)$, then

$$\tilde S_t^{(\varepsilon)}(A, \varphi) = I(S_t^{(\varepsilon)}(p, q)) \tag{5.9.7}$$

(and $\tilde S_t^{(\varepsilon)}(A, \varphi)$ is only defined for those pairs $(A, \varphi)$ for which Eq. (5.9.7)
makes sense).
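(i) There is a subset $W^{(\varepsilon)} \subset W$ invariant for the transformations $S_t^{(\varepsilon)}$ and a
map $F^{(\varepsilon)} : W^{(\varepsilon)} \to V^{(\varepsilon)} \times T^\ell$, $V^{(\varepsilon)} \subset V$, invertible and continuous, denoted

$$F^{(\varepsilon)}(A, \varphi) = (a(A, \varphi, \varepsilon), \psi(A, \varphi, \varepsilon)). \tag{5.9.8}$$

Furthermore, there is a continuous function $\Omega^{(\varepsilon)} : W^{(\varepsilon)} \to \mathbb{R}^\ell$ such that

$$F^{(\varepsilon)}(\tilde S_t^{(\varepsilon)}(A, \varphi)) = (a(A, \varphi, \varepsilon),\; \psi(A, \varphi, \varepsilon) + \Omega^{(\varepsilon)}(A, \varphi)\,t). \tag{5.9.9}$$

Therefore, the motions with initial datum in $W^{(\varepsilon)}$ can be thought of as rotations
of an $\ell$-dimensional torus.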

(ii) The set $W^{(\varepsilon)} \subset W$ is generally only measurable in the sense of Lebesgue
and not necessarily in the sense of Riemann, and its measure is such that

$$\frac{\text{volume}\; W^{(\varepsilon)}}{\text{volume}\; W} \xrightarrow[\varepsilon \to 0]{} 1. \tag{5.9.10}$$

(iii) The functions $(\varepsilon, A, \varphi) \to F^{(\varepsilon)}(A, \varphi)$ can be extended to $C^{(k)}$ functions
with arbitrary preassigned $k$ on $(-1, 1) \times V \times T^\ell$ and the same can be said
of the functions $(A, \varphi) \to \Omega^{(\varepsilon)}(A, \varphi)$. Furthermore, such extensions have the
property

$$F^{(0)}(A, \varphi) = (A, \varphi), \qquad \Omega^{(0)}(A, \varphi) = \frac{\partial h_0}{\partial A}(A). \tag{5.9.11}$$

(iv) If the original system is an analytic analytically integrable system and $\Psi$
is also analytic, then one can take $k = +\infty$ in (iii).
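
The non isochrony assumption (5.9.5) is a condition on the Hessian matrix of $h_0$, since $\omega = \partial_A h_0$. The sketch below checks it by finite differences for two hypothetical examples that are not taken from the text: a pair of free rotators, $h_0(A) = \frac12 |A|^2$, for which the matrix is the identity, and a pair of harmonic oscillators, $h_0(A) = \omega_0 \cdot A$, for which it vanishes identically. Function names and the step size are illustrative choices.

```python
import numpy as np


def frequency_jacobian(h0, A, step=1e-4):
    """Central-difference Hessian of h0 at A, i.e. J_ij = d omega_i / d A_j."""
    A = np.asarray(A, dtype=float)
    n = A.size
    J = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = step
            ej[j] = step
            J[i, j] = (h0(A + ei + ej) - h0(A + ei - ej)
                       - h0(A - ei + ej) + h0(A - ei - ej)) / (4.0 * step ** 2)
    return J


def rotators(A):
    """Hypothetical anisochronous example: h0 = |A|^2 / 2, so omega(A) = A."""
    return 0.5 * np.dot(A, A)


def oscillators(A, omega0=(1.0, np.sqrt(2.0))):
    """Hypothetical isochronous example: h0 = omega0 . A, so omega is constant."""
    return np.dot(omega0, A)
```

For instance, `np.linalg.det(frequency_jacobian(rotators, [0.3, 0.7]))` is close to 1, while the same determinant for `oscillators` is zero up to rounding, which is the case excluded by the proposition.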


Observations. 

(1) This theorem tells us the sense in which perturbing an integrable sys- 
tem with proper pulsations “really” variable, see Eq. (5.9.5), i.e., “non 
isochronous”, one obtains a system that can still be thought of as a system 
moving essentially in the same way as the unperturbed one, see Eq. (5.9.11). 
(2) $W^{(\varepsilon)}$ can be thought of as foliated into invariant $\ell$-dimensional tori with
equations

$$(A, \varphi) = (F^{(\varepsilon)})^{-1}(a, \psi), \qquad \psi \in T^\ell \tag{5.9.12}$$

parameterized by $\ell$ parameters $a \in V^{(\varepsilon)}$. By Eq. (5.9.11), each of such tori is a
slight deformation of the torus described by $\{a\} \times T^\ell$ in the original variables.
(3) Observation (2) is interpreted as saying that the foliation of the phase space 
into invariant tori (characteristic of the integrable systems) is, at least in the 
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canonically integrable anisochronous cases, preserved under small perturba- 
tions, provided one disregards a subset of phase space with small measure. 
(4) The fact that $V^{(\varepsilon)}$ can only be shown to be Lebesgue measurable (and it
probably cannot be chosen Riemann measurable) is quite unpleasant because
it means that $W^{(\varepsilon)}$, although containing many points (for $\varepsilon$ small), cannot
be approximated by “nice sets” and, therefore, it becomes difficult to decide
constructively whether a given point is or is not in $W^{(\varepsilon)}$. However, a little
thought shows that (iii) partially solves this problem from a practical point
of view.

(5) Note that an $\ell$-dimensional torus in a $2\ell$-dimensional space21 does not
split the space $\mathbb{R}^{2\ell}$ into “interior” and “exterior” parts, unless $\ell = 1$. This
is perhaps what makes clearer the incompleteness of the result (iii). In fact,
a point beginning its motion in $W \setminus W^{(\varepsilon)}$, i.e., outside the invariant tori, may
“sneak” through the tori of the foliations very far from the vicinity of the
unperturbed torus on which it would move if $\varepsilon = 0$. This phenomenon, called
“Arnold diffusion”, is not well understood, [36].

It would be nice to understand criteria sufficient for the existence of a 
Riemann-measurable set of initial data (possibly with positive measure) which 
does not undergo the Arnold diffusion. I.e., implying that, although only a 
Lebesgue measurable set of points in phase space moves essentially as if the 
perturbation were not present (i.e., quasi-periodically, on tori close to the un- 
perturbed ones), there is a Riemann measurable set of points moving (perhaps 
not quasi periodically) close to the unperturbed tori located near the initial 
data. 

In fact, this is what the numerical experiments sometimes seem to suggest. 
It can be rigorously proved for some non autonomous 1-degree of freedom 
systems (once the above theorem is extended, as can be done, to the non 
autonomous system with external periodic forces of Hamiltonian type) or for 
2-degrees-of-freedom autonomous systems. 

In such cases, however, the entire problem disappears as the motion takes 
place on a three-dimensional set (because, in the first case, the “phase space” 
is three dimensional (p,q,t) and in the second, although the phase space is 
four dimensional, the motion takes place on the three-dimensional surface of 
constant energy), and in $R^3$ a two-dimensional torus has an interior and an
exterior. 

(6) The above theorem cannot be applied to perturbations of harmonic oscil- 
lators since the non isochrony condition of Eq. (5.9.5) is manifestly violated. 

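Nevertheless, if the $\ell$ pulsations $\omega_0 = (\omega_1, \ldots, \omega_\ell)$ of the harmonic
oscillator verify a “non resonance” or “Diophantine” condition: $\exists\, C < \infty$, $\alpha < \infty$
and

$$|\omega_0 \cdot \nu|^{-1} \le C\,|\nu|^{\alpha}, \qquad \forall\, \nu \ne 0 \tag{5.9.13}$$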


²¹ or $R^{2\ell-1}$ if energy is taken into account, unless $\ell \le 2$.
where $\nu = (\nu_1, \ldots, \nu_\ell) \in Z^\ell$ is an “integer vector”, then Eq. (5.9.5) can be
replaced by a condition on the perturbation. Namely, if $f(A, \varphi)$ denotes the
perturbing function expressed in the $(A, \varphi)$ variables, $(A, \varphi) \in V \times T^\ell$, and if we define

$$f_0(A) = \frac{1}{(2\pi)^\ell} \int_{T^\ell} f(A, \varphi)\, d\varphi, \tag{5.9.14}$$

and if the matrix $\frac{\partial^2 f_0}{\partial A_i \partial A_j}(A)$, $i, j = 1, \ldots, \ell$, has non vanishing determinant $\forall\, A \in V$,
the theorem’s results (i), (ii), (iii), and (iv) hold without change.
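The Diophantine condition (5.9.13) can be explored numerically. What follows is only a minimal sketch and is not part of the original text: the function name `diophantine_constant`, the cutoff `max_norm` and the sample frequency vectors are illustrative assumptions. The code scans the integer vectors $\nu$ with $0 < |\nu| \le$ `max_norm` (with $|\nu| = \sum_i |\nu_i|$) and returns the smallest $C$ compatible with (5.9.13) on that finite set. A finite scan can suggest, but never prove, that the condition holds.

```python
import itertools
import math


def diophantine_constant(omega, alpha, max_norm):
    """Return max of 1 / (|omega . nu| * |nu|**alpha) over 0 < |nu| <= max_norm.

    Here |nu| = sum(|nu_i|). The result is the smallest C for which the
    inequality (5.9.13) holds on the scanned (finite) set of integer vectors.
    An exact resonance (omega . nu == 0) yields math.inf.
    """
    worst = 0.0
    component_range = range(-max_norm, max_norm + 1)
    for nu in itertools.product(component_range, repeat=len(omega)):
        norm = sum(abs(n) for n in nu)
        if norm == 0 or norm > max_norm:
            continue
        divisor = abs(sum(w * n for w, n in zip(omega, nu)))
        if divisor == 0.0:
            return math.inf
        worst = max(worst, 1.0 / (divisor * norm ** alpha))
    return worst


golden = (1.0 + math.sqrt(5.0)) / 2.0
# Illustrative choice: frequencies (1, golden mean), a classical non resonant pair.
c_small = diophantine_constant((1.0, golden), alpha=1.0, max_norm=20)
c_large = diophantine_constant((1.0, golden), alpha=1.0, max_norm=60)
# Resonant pair: 2*1 - 1*2 = 0, so no finite C exists.
c_resonant = diophantine_constant((1.0, 2.0), alpha=1.0, max_norm=5)
print(c_small, c_large, c_resonant)
```

For the golden-mean pair the returned constant does not grow when the cutoff is enlarged, which is the behavior one expects from a Diophantine vector with $\alpha = 1$; for the resonant pair no finite constant exists.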

(7) Even worse is the situation of the solar system, i.e., if one tries to apply 
the above theorem to Eq. (5.9.2) as a perturbation to Eq. (5.9.1). 

The problem lies not so much in the unboundedness of the potentials in the 
Kepler motions. In fact, in a vicinity W of the Kepler motions of the actual 
planets, there are no collisions, so the perturbation is bounded there (W has 
to be thought of as a subset in the nine planets phase space $R^{27} \times R^{27}$).

The difficulty lies in the fact that for the unperturbed system described
by Eq. (5.9.1), Kepler’s laws hold and say that each planet moves periodically
with pulsation $\omega_i$; therefore, the system moves quasi-periodically with
nine independent pulsations instead of the 27 that should be present if the
system were really anisochronous, and the condition (5.9.5) cannot hold (since
two of the three pulsations of each planet $i$ have to be integer multiples of $\omega_i$).

Nevertheless, it is possible to find a version of Proposition 15 covering 
this problem at least in some nontrivial cases of N gravitating point masses 
attracted by a fixed center and attracting each other (see also p.493). 

Without quoting the exact results, we mention one of their consequences: 
there exist quasi periodic motions of the planets (i.e., solutions of Eq. (5.9.2))
which take place on almost circular, almost closed, and almost coplanar orbits 
of distinct radii, provided the masses are very small; hence, there are motions 
of Eq. (5.9.2) quasi-periodic and without collisions or escapes. 

The last statement solves a problem which for centuries fascinated
physicists, mathematicians, and astronomers. Newton’s universal gravitation
law is not incompatible, by itself, with the stability of the solar system,
a fact empirically observed for millennia and hoped for by everybody.
Nevertheless, it remains an open question whether or not our own solar system,
modeled by Eq. (5.9.2), is actually stable: the initial data on the positions and
velocities of the planets and their masses seem too far from values to which
the above-mentioned extensions of Proposition 15 can be applied.
(8) The proof of Proposition 15 gives much more information than its text 
expresses. It might even be possible to extract from it, and from its extensions 
mentioned in Observation (6), some astronomically interesting results. How- 
ever, much work has to be done, since the results of Proposition 15 and its 
extensions are seldom obtained in “optimal” form. Actually, to my knowledge, 
careful estimates based on the proof of the theorem and taking full advantage 
of the peculiarities of a given equation of interest have begun to appear only 
relatively recently in the simplest cases. 
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(9) The ideas for the proof of the above theorem arise from perturbation 
theory for classical Hamiltonian systems: to it the next section is devoted. 
In §5.11 and §5.12 it will be shown how the ideas of perturbation theory
may be applied to prove Proposition 15 in the simplest case of a canonically 
analytically integrable system, analytically perturbed. 

It is a shame that the old classical perturbation theory, which gave rise to 
analytical mechanics and to the Hamilton-Jacobi method, is nowadays almost 
forgotten since many people seem to know or care only for the quantum- 
mechanical perturbation theory. This fact is largely responsible for the aura 
of mystery which still seems to surround the above theorem. 


5.10 Formal Theory of Perturbations. Hamilton-Jacobi Method


Or ti riman, lettor, sovra ’l tuo banco, 
Dietro pensando a cio che si preliba, 
S’esser vuoi lieto assai prima che stanco. 
Messo t’ho innanzi: omai per te ti ciba; 
Che a se torce tutta la mia cura 

Quella materia ond’io son fatto scriba.²²


Consider an $\ell$-degree-of-freedom system with a Hamiltonian function $H$
on an open set $W$ in phase space. Denote $(p, q)$ the points in $W$ and denote

$$(p, q) \to H(p, q), \qquad (p, q) \in W \tag{5.10.1}$$

the Hamiltonian function $H$.

We shall suppose that this system is canonically analytically integrable via
an analytic canonical transformation $I$ integrating it by transforming $W$ into
$V \times T^\ell$ with $V \subset R^\ell$, open and bounded.²³

The transformation $I$ transforms the Hamiltonian $H$ into a function of the
first $\ell$ variables of $(A, \varphi) = I(p, q)$: $h(A) = H(I^{-1}(A, \varphi))$, $\forall\, (A, \varphi) \in V \times T^\ell$.

We recall that the variables $A$ are called “action variables”, while the $\varphi$
variables are called “angle variables”.

Let F be an analytic function on W and consider the Hamiltonian system 
described on W by the Hamiltonian function 


²² In basic English:
Now stay, o reader, on your bench, 
thinking about what is foreshadowed 
if you wish to be happy before being tired. 
I did initiate you: now proceed by yourself; 
as my whole thoughts are absorbed 
by the matter about which I am scribe. 

(Dante, Paradiso, Canto X) 
²³ See Definition 11, p.289, §4.8.
$$H(p, q) + \varepsilon F(p, q) \tag{5.10.2}$$

or, in the action-angle variables, $(A, \varphi) \in V \times T^\ell$, by

$$h(A) + \varepsilon f(A, \varphi) \quad \text{with} \tag{5.10.3}$$
$$h(A) = H(I^{-1}(A, \varphi)), \qquad f(A, \varphi) = F(I^{-1}(A, \varphi)). \tag{5.10.4}$$


Perturbation theory proposes to compare, for $\varepsilon$ small, the motions of the
system with Hamiltonian $h$ and those of the system with Hamiltonian $h + \varepsilon f$,
usually with the same initial data.

As stated in §5.9 the comparison methods for solutions of a differential
equation depending on a parameter (Lyapunov criterion, attractive manifold 
theorem, Hopf theorem, etc.) often reveal themselves to be inadequate in 
the analysis of the problems and difficulties connected with the stability of 
conservative systems. Such problems appear quite different from those arising 
in the theory of dissipative systems, at least at the beginning (although the 
advanced theory ultimately may conceptually coincide). 

However, the special form of the Hamiltonian equations permits the use 
of a simple algorithm, of great interest for applications, for the analysis of the 
motions of quasi-integrable systems. 

The idea is to change variables via a completely canonical transformation
$(A, \varphi) \to (A', \varphi')$, arranging things so that the “old” Hamiltonian (5.10.3)
takes the form

$$h_\varepsilon^{(n)}(A') + \varepsilon^{n+1} f_\varepsilon^{(n)}(A', \varphi') \tag{5.10.5}$$

in the new variables, where $(A', \varphi')$ denote the new variables and $h_\varepsilon^{(n)}$, $f_\varepsilon^{(n)}$
are analytic functions of $\varepsilon$ near 0, of $\varphi' \in T^\ell$ and of $A'$ in a suitable open set.

Hence, for $\varepsilon$ small, the error that would be made supposing that in the
variables $(A', \varphi')$ the system is integrable and described by the Hamiltonian
$h_\varepsilon^{(n)}$ is very much smaller than the one that would be made assuming the
system as integrable in the original variables $(A, \varphi)$ simply setting $\varepsilon = 0$ in
Eq. (5.10.3): provided, as we suppose as an extra essential requirement of
construction, the canonical transformation itself is not singular at $\varepsilon = 0$.

Intuitively, neglecting in Eq. (5.10.5), or, better, in the Hamiltonian equations
associated with Eq. (5.10.5), the $\varphi'$-dependent term produces an error of
the order $\varepsilon^{n+1} T$ in the solutions of the equations, if they are observed up to a time
$T$. Hence, given an approximation $\eta$, it will be possible to retain it, although
neglecting the influence of $f_\varepsilon^{(n)}$ on the motions of Eq. (5.10.5), for a time of
the order $T_{\eta,\varepsilon} \propto \eta\, \varepsilon^{-(n+1)}$.

For $\varepsilon$ small this may give a substantially better result for $n > 0$ than the one
corresponding to the simple, but often too rough, analogous approximation
with $n = 0$ (i.e., $\varepsilon = 0$ in Eq. (5.10.3)).
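As a purely illustrative instance of the time-scale estimate above, the following arithmetic uses assumed round values ($\varepsilon = 10^{-3}$, $\eta = 10^{-2}$, proportionality constant set equal to 1), which do not come from the text:

```latex
% Illustrative values only: \varepsilon = 10^{-3}, \eta = 10^{-2}, constant = 1
T_{\eta,\varepsilon} \sim \eta\,\varepsilon^{-(n+1)}:
\qquad n = 0:\ T \sim 10^{-2}\cdot 10^{3} = 10,
\qquad n = 1:\ T \sim 10^{-2}\cdot 10^{6} = 10^{4},
\qquad n = 2:\ T \sim 10^{-2}\cdot 10^{9} = 10^{7}.
```

Each further order removed gains, in this example, a factor $\varepsilon^{-1} = 10^{3}$ in the time over which the prescribed accuracy can be kept.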

The reader will realize that the method that will be used for the “reduction 
to higher order” of the perturbation via a canonical transformation is nothing 
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more than a method for constructing successive approximations to the time 
independent solutions of the Hamilton-Jacobi equation Eq. (3.11.6), p.213. 

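There are two remarkable cases which can actually be treated along the
above lines, building a completely canonical transformation changing Eq.
(5.10.3) into Eq. (5.10.5) at least for $(A, \varphi)$ in a neighborhood of the form
$S_\rho(A_0) \times T^\ell \subset V \times T^\ell$, where $S_\rho(A_0)$ is a sphere with radius $\rho$ in $R^\ell$ around
a preassigned point $A_0$, and for some $n > 0$ and $\varepsilon$ small.

The first case arises when

$$h(A) = \omega_0 \cdot A \tag{5.10.6}$$

with $\omega_0 \in R^\ell$ such that there are $C, \alpha > 0$, for which

$$C = \sup_{\nu \in Z^\ell,\ \nu \ne 0} \frac{|\omega_0 \cdot \nu|^{-1}}{|\nu|^{\alpha}} < +\infty. \tag{5.10.7}$$

The second case arises when the Fourier coefficients of the development of $f$:

$$f(A, \varphi) = \sum_{\nu \in Z^\ell} f_\nu(A)\, e^{i\nu\cdot\varphi}, \qquad f_\nu(A) = \frac{1}{(2\pi)^\ell} \int_{T^\ell} f(A, \varphi)\, e^{-i\nu\cdot\varphi}\, d\varphi \tag{5.10.8}$$

vanish for $|\nu| > N$ and, setting

$$\omega(A) = \frac{\partial h}{\partial A}(A), \quad \text{one has} \tag{5.10.9}$$
$$|\omega(A_0) \cdot \nu| > 0, \qquad \forall\, \nu \in Z^\ell,\ 0 < |\nu| \le N. \tag{5.10.10}$$

In the first case, it is even possible to put the Hamiltonian into the form of
Eq. (5.10.5), $\forall\, n = 0, 1, \ldots$, provided $\varepsilon$ is small enough (depending, however,
on the choice of $n$).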

The above statements are illustrated in the following classical propositions. 


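16 Proposition. Consider the Hamiltonian (5.10.3) on $V \times T^\ell$ with

$$f(A, \varphi) = \sum_{\nu \in Z^\ell,\ |\nu| \le N} f_\nu(A)\, e^{i\nu\cdot\varphi} \tag{5.10.11}$$

analytic on $V \times T^\ell$, with $N > 0$, and suppose that, $\forall\, A_0 \in V$, the function $h$
is such that

$$|\omega(A_0) \cdot \nu| > 0, \qquad \forall\, \nu \in Z^\ell,\ 0 < |\nu| \le N. \tag{5.10.12}$$

Then there exist $\rho_1 > 0$, $\varepsilon_1 > 0$ and, $\forall\, \varepsilon \in (-\varepsilon_1, \varepsilon_1)$, a completely canonical
transformation $(A, \varphi) \leftrightarrow (A', \varphi')$ defined for $(A, \varphi) \in \widetilde W_\varepsilon$, with
$V \times T^\ell \supset \widetilde W_\varepsilon \supset S_{\rho_1/2}(A_0) \times T^\ell$ and with values onto $S_{\rho_1}(A_0) \times T^\ell$, smoothly depending
on $\varepsilon$ and transforming the Hamiltonian (5.10.3) into

$$h_\varepsilon^{(1)}(A') + \varepsilon^2 f_\varepsilon^{(1)}(A', \varphi'), \tag{5.10.13}$$

where $h_\varepsilon^{(1)}$, $f_\varepsilon^{(1)}$ are analytic in $\varepsilon, A', \varphi'$. Furthermore $h_\varepsilon^{(1)}$ can be given a simple
expression; see Eq. (5.10.25) below.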


Observation. As mentioned above, the reader should interpret the proof that
follows as a “perturbative solution to order $\varepsilon$” of the Hamilton-Jacobi equation
in the time-independent case, i.e., when $H$ in Eq. (3.11.68), p.226, does not
explicitly depend on $t$. Actually, the above proposition is the basic example
of how the method of Hamilton-Jacobi concretely works. Most applications of
the Hamilton-Jacobi method are based on this proposition.


PROOF. The canonical transformation will be determined by looking for a
generating function $\Phi$, see §3.11 and §3.12 from p.222 on.

Such a transformation is expected to be close to the identity up to
infinitesimals $O(\varepsilon)$, thus the unknown generating function will be written as

$$A' \cdot \varphi + \Phi(A', \varphi), \tag{5.10.14}$$

where $(A', \varphi) \to A' \cdot \varphi$ is the generating function of the identity map and $\Phi$
is infinitesimal in $\varepsilon$. The function $\Phi$ will be determined by requiring that the
Hamiltonian in the new variables $(A', \varphi')$ defined by the formal map

$$A = A' + \frac{\partial\Phi}{\partial\varphi}(A', \varphi), \qquad \varphi' = \varphi + \frac{\partial\Phi}{\partial A'}(A', \varphi), \tag{5.10.15}$$

i.e., the function

$$h\Big(A' + \frac{\partial\Phi}{\partial\varphi}(A', \varphi)\Big) + \varepsilon f\Big(A' + \frac{\partial\Phi}{\partial\varphi}(A', \varphi), \varphi\Big), \tag{5.10.16}$$

is $\varphi$ independent up to terms infinitesimal of higher order in $\varepsilon$.

Since, as already said, we expect that $\Phi \sim O(\varepsilon)$, we can heuristically find,
by developing Eq. (5.10.16) in series with respect to $\frac{\partial\Phi}{\partial\varphi}$, that the equation
for $\Phi$ (the “Hamilton-Jacobi equation to first order in $\varepsilon$”) is

$$\frac{\partial h}{\partial A}(A') \cdot \frac{\partial\Phi}{\partial\varphi}(A', \varphi) + \varepsilon f(A', \varphi) = \{\varphi\text{-independent function}\} \tag{5.10.17}$$

which, written in terms of the Fourier components of $\Phi$, means that

$$i\,(\omega(A') \cdot \nu)\, \Phi_\nu(A') + \varepsilon f_\nu(A') = 0, \qquad \forall\, \nu \in Z^\ell,\ |\nu| > 0. \tag{5.10.18}$$

This equation is really a soluble equation if $|A' - A_0| < \tilde\rho_1$, with $\tilde\rho_1$ so small
that the closure of $S_{\tilde\rho_1}(A_0)$ is a subset of $V$ and therefore

$$\omega(A') \cdot \nu \ne 0, \qquad \forall\, \nu \in Z^\ell,\ 0 < |\nu| \le N, \tag{5.10.19}$$

see Eq. (5.10.12). Then we can define in $S_{\tilde\rho_1}(A_0) \times T^\ell$ the analytic function

$$\Phi(A', \varphi) = -\varepsilon \sum_{0 < |\nu| \le N} \frac{f_\nu(A')\, e^{i\nu\cdot\varphi}}{i\, \omega(A') \cdot \nu}. \tag{5.10.20}$$

It follows from the implicit function theorem, see Appendix G, Corollaries 3
and 4, that the second of Eqs. (5.10.15) can be uniquely inverted with respect
to $\varphi$ and the first of Eqs. (5.10.15) can be inverted with respect to $A'$ in the
respective forms
p=y' + A(A', g"), A E C™ (Sp, (Ao) x T“) (5 10 21) 
A' =A + E'(A, p), 2 eC™”(Sa (Ao) x T’) a 


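if $\varepsilon$ is small enough,²⁴ i.e., if $|\varepsilon| < \varepsilon_1$, with $\varepsilon_1$ suitably chosen; and also there
is $B > 0$ such that

$$\Big|\frac{\partial\Phi}{\partial\varphi}(A', \varphi)\Big| \equiv |\Xi'(A, \varphi)| < B\,|\varepsilon|, \tag{5.10.22}$$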


so that, if Ble| < 4g,, the maps (A’,p’) > C(A’,¢’) = (A, 9): 


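$$A = A' + \frac{\partial\Phi}{\partial\varphi}(A', \varphi' + \Delta(A', \varphi')), \qquad \varphi = \varphi' + \Delta(A', \varphi') \tag{5.10.23}$$

and $(A, \varphi) \to C'(A, \varphi) = (A', \varphi')$:

$$A' = A + \Xi'(A, \varphi), \qquad \varphi' = \varphi + \frac{\partial\Phi}{\partial A'}(A + \Xi'(A, \varphi), \varphi) \tag{5.10.24}$$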


are well defined on S15, (Ao) x T“ and take values in Sz, (Ao) x T‘. Further- 
more, C and C’ map Siz, (Ao) x T“ into Siz, (Ao) x T’ and CC = C'C = { 
identity map} on S 17, (Ao) x T° by construction (and by the uniqueness part 
of the implicit function theorem). 

Therefore, the Jacobian determinants of C or C” on Siz, (Ao) x T! cannot 
vanish and, hence, by Proposition 21, §3.11, p.220, C is a completely canonical 
map of Siz (Ao) x T* onto its image W; D Siz, (Ao) x T’. So we take 
E= Po +01). 

By the construction of $\Phi$ [see Eq. (5.10.17)] the Hamiltonian function in the
$(A', \varphi')$ variables has the form of Eq. (5.10.13). By substituting Eq. (5.10.20)
into Eq. (5.10.17), one, in fact, also obtains

$$h_\varepsilon^{(1)}(A') = h(A') + \varepsilon f_0(A'), \tag{5.10.25}$$

where $f_0$ is the 0-th Fourier coefficient of $f$, see Eq. (5.10.8).


²⁴ $\Xi'$ and $\Delta$ are $C^\infty$ also in $\varepsilon$, jointly with $(A, \varphi)$ or $(A', \varphi')$, by the implicit function
theorems in Appendix G.

The analyticity of the canonical maps C and C’ will not be discussed here. 
It follows if Eqs. (5.10.21) are obtained via the application of analytic implicit 
function theorems that will be discussed in the next section; see Propositions 
18-20. mbe 
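The first-order step of the proof can be checked numerically. The sketch below is not part of the original text and rests on stated assumptions: the perturbation is represented by a hypothetical dictionary of Fourier coefficients at one fixed value of $A'$, the vector `omega_example` stands for $\omega(A')$ at that point, and all function names are illustrative. It builds the Fourier coefficients $\Phi_\nu = -\varepsilon f_\nu / (i\,\omega\cdot\nu)$ required by Eq. (5.10.18) and verifies that $\omega \cdot \partial\Phi/\partial\varphi + \varepsilon f - \varepsilon f_0$ vanishes, which is the content of Eq. (5.10.17) with the $\varphi$-independent function equal to $\varepsilon f_0$.

```python
import cmath


def first_order_generator(f_coeffs, omega, eps):
    """Fourier coefficients Phi_nu = -eps * f_nu / (i * omega . nu), nu != 0."""
    phi_coeffs = {}
    for nu, f_nu in f_coeffs.items():
        if not any(nu):
            continue  # the nu = 0 harmonic is not removed: it gives eps * f_0
        divisor = sum(w * n for w, n in zip(omega, nu))
        if divisor == 0:
            raise ZeroDivisionError(f"resonant harmonic nu = {nu}")
        phi_coeffs[nu] = -eps * f_nu / (1j * divisor)
    return phi_coeffs


def fourier_sum(coeffs, angles, derivative_index=None):
    """Evaluate sum_nu c_nu e^{i nu.phi}, or its derivative in phi_j."""
    total = 0j
    for nu, c in coeffs.items():
        phase = cmath.exp(1j * sum(n * a for n, a in zip(nu, angles)))
        factor = 1.0 if derivative_index is None else 1j * nu[derivative_index]
        total += factor * c * phase
    return total


def homological_residual(f_coeffs, omega, eps, angles):
    """omega . dPhi/dphi + eps * f - eps * f_0, which should vanish."""
    phi_coeffs = first_order_generator(f_coeffs, omega, eps)
    drift = sum(w * fourier_sum(phi_coeffs, angles, j) for j, w in enumerate(omega))
    f_zero = f_coeffs.get(tuple(0 for _ in omega), 0.0)
    return drift + eps * fourier_sum(f_coeffs, angles) - eps * f_zero


# Hypothetical example with l = 2: f = 0.3 + cos(phi_1) + 0.5 * cos(phi_1 - 2 phi_2)
f_example = {(0, 0): 0.3, (1, 0): 0.5, (-1, 0): 0.5, (1, -2): 0.25, (-1, 2): 0.25}
omega_example = (1.0, 0.37)  # stands for omega(A') at one fixed A'
residual = homological_residual(f_example, omega_example, 0.05, (0.4, 1.3))
print(abs(residual))
```

A resonant harmonic (a vanishing divisor $\omega\cdot\nu$ with $0 < |\nu| \le N$) makes the construction fail, which is why the sketch raises an error in that case instead of returning a value.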


The above discussion is the basis for the most common algorithms in the 
calculations of the perturbed Hamiltonian motions; it leads to the natural 
idea of iterating the procedure by reducing the perturbation from O(e?) to 
O(e*) , ete. 

The difficulty lies in the fact that, in general, the new Hamiltonian
(5.10.16) which, to first order in $\varepsilon$, reduces to Eq. (5.10.25), no longer has
the form necessary for applicability of Proposition 16. In fact, the perturbation
of order $\varepsilon^2$ will be a function of $(A', \varphi')$ which has all, or at least
infinitely many, harmonic components in $\varphi'$ non vanishing, disregarding
exceptional cases. One can convince oneself of this with some thought, noting
that $\frac{\partial\Phi}{\partial\varphi}(A', \varphi' + \Delta(A', \varphi'))$ contains terms like $e^{i\nu\cdot\Delta(A', \varphi')}$ and, unless some
“miraculous” cancellations take place, will no longer be a trigonometric
polynomial in $\varphi'$.

The following proposition, valid in the other case considered in the in- 
troduction to Proposition 16, is quite interesting because it shows that with 
a slight modification of the method of the above proof but under different 
assumptions, one can “remove” the perturbation to an arbitrary order in $\varepsilon$.


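17 Proposition. Consider the Hamiltonian function given by Eq. (5.10.3)
on $V \times T^\ell$ with $h$ verifying Eqs. (5.10.6) and (5.10.7) and $f$ analytic. There
is $\rho > 0$ such that:

(1) For each $n = 0, 1, \ldots$ there exists $\varepsilon_n > 0$ and, $\forall\, |\varepsilon| < \varepsilon_n$, functions $\Phi_{\varepsilon,n}$
defined on $S_\rho(A_0) \times T^\ell$ and analytic in $\varepsilon$ and in the other arguments $(A', \varphi)$,
generating completely canonical transformations $(A, \varphi) \leftrightarrow (A', \varphi')$ such that

$$A = A' + \frac{\partial\Phi_{\varepsilon,n}}{\partial\varphi}(A', \varphi), \qquad \varphi' = \varphi + \frac{\partial\Phi_{\varepsilon,n}}{\partial A'}(A', \varphi), \tag{5.10.26}$$

mapping a subset $W_{\varepsilon,n}$, $S_{\rho/2}(A_0) \times T^\ell \subset W_{\varepsilon,n} \subset V \times T^\ell$, onto $S_\rho(A_0) \times T^\ell$.

(2) The map of Eq. (5.10.26) transforms the Hamiltonian into the form
(“Birkhoff normal form”)

$$h_{\varepsilon,n}(A') + \varepsilon^{n+1} f_\varepsilon^{(n)}(A', \varphi') \tag{5.10.27}$$

where $h_{\varepsilon,n}(A')$ is analytic in $\varepsilon, A'$ and $f_\varepsilon^{(n)}$ is also analytic in $\varepsilon, A', \varphi'$. An
explicit expression for $h_{\varepsilon,n}(A')$ is Eq. (5.10.41).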
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Observation. The construction described in the proof of this proposition is 
often referred to as the “Birkhoff transformation”. 


PROOF. Define heuristically:

$$\Phi_{\varepsilon,n}(A', \varphi) = \sum_{k=1}^{n} \varepsilon^k\, \Phi^{(k)}(A', \varphi) \tag{5.10.28}$$

and consider the Hamiltonian in the new variables $(A', \varphi)$, Eq. (5.10.26):

$$h\Big(A' + \frac{\partial\Phi_{\varepsilon,n}}{\partial\varphi}(A', \varphi)\Big) + \varepsilon f\Big(A' + \frac{\partial\Phi_{\varepsilon,n}}{\partial\varphi}(A', \varphi), \varphi\Big). \tag{5.10.29}$$

Developing this expression in powers of $\varepsilon$ using the analyticity of $f$ and $h$ (the
latter is actually linear) in $A$, impose that the resulting series in $\varepsilon$,

$$\sum_{k=0}^{\infty} \mathcal{N}^{(k)}(A', \varphi)\, \varepsilon^k, \tag{5.10.30}$$

has all the coefficients $\mathcal{N}^{(k)}$, $k = 0, 1, \ldots, n$, $\varphi$-independent.

This condition allows one to determine recursively $\Phi^{(1)}, \ldots, \Phi^{(n)}$ [and it
appears that $\varepsilon\Phi^{(1)}$ is given by Eq. (5.10.20), of course].

Then, once the expressions for $\Phi^{(1)}, \ldots, \Phi^{(n)}$ are found, one shall write
Eq. (5.10.26), and by taking $\varepsilon$ small, proceeding exactly as in the proof of
Proposition 16, the implicit function theorem will be used to guarantee that
Eq. (5.10.26) actually defines a canonical transformation between $S_\rho(A_0) \times T^\ell$
and some $W_{\varepsilon,n} \subset V \times T^\ell$ with $W_{\varepsilon,n} \supset S_{\rho/2}(A_0) \times T^\ell$. The invertibility
conditions will depend on $n$. By construction, Eq. (5.10.27) will then follow,
with $h_{\varepsilon,n}$, $f_\varepsilon^{(n)}$ of class $C^\infty$ in $\varepsilon, A', \varphi'$. They are actually analytic and this
point can be commented as at the end of the proof of Proposition 16, see
p.469.

Hence, the whole problem is to show that one can find $\Phi^{(1)}, \ldots, \Phi^{(n)}$ so
that the formal series of Eq. (5.10.30) has the first $(n + 1)$ coefficients with
harmonics in $\varphi$ of order $\nu \ne 0$ vanishing. This is a purely algebraic problem.

As amply exploited in the following section, where the question will be
more systematically treated, the analyticity assumption on $f$ implies that it
can be developed in the Taylor series about $A_0$ and in the Fourier series in $\varphi$
in the form

$$f(A, \varphi) = \sum_{a \in Z_+^\ell} \sum_{\nu \in Z^\ell} f_\nu^{(a)} (A - A_0)^a\, e^{i\nu\cdot\varphi} \equiv \sum_{a \in Z_+^\ell} f^{(a)}(A_0, \varphi)\,(A - A_0)^a \tag{5.10.31}$$

where, see Definition 13, p.336, $a = (a_1, \ldots, a_\ell) \in Z_+^\ell$ and $\nu = (\nu_1, \ldots, \nu_\ell) \in Z^\ell$ and

$$(A - A_0)^a = \prod_{i=1}^{\ell} (A_i - A_{0i})^{a_i}, \qquad \nu \cdot \varphi = \sum_{i=1}^{\ell} \nu_i \varphi_i \tag{5.10.32}$$

and, furthermore, there are $R > 0$, $\rho_0 > 0$, $\xi_0 > 0$, such that

$$|f_\nu^{(a)}| \le R\, \rho_0^{-|a|}\, e^{-\xi_0 |\nu|}, \qquad \forall\, a \in Z_+^\ell,\ \forall\, \nu \in Z^\ell, \tag{5.10.33}$$

if $|a| \stackrel{def}{=} \sum_{i=1}^{\ell} a_i$, $|\nu| \stackrel{def}{=} \sum_{i=1}^{\ell} |\nu_i|$.

This inequality is not immediately obvious and it will be discussed in §5.11;
for the time being, we suppose and use Eq. (5.10.33) without discussion.

Developing Eq. (5.10.29) in powers of $\varepsilon$ and collecting the terms of equal
order in $\varepsilon$ and setting $f^{(a)}(A', \varphi) = \frac{1}{a!} \frac{\partial^{|a|} f}{\partial A^a}(A', \varphi)$ with $a! = \prod_{i=1}^{\ell} a_i!$ (it is
the $a$-th coefficient of the Taylor expansion of $f$ around $A'$ at fixed $\varphi$), one
finds [using Eq. (5.10.31)]


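$$\mathcal{N}^{(k)}(A', \varphi) = \Big\{ \sum_{a \in Z_+^\ell} f^{(a)}(A', \varphi) \sum_{\{n_s^j\}}^{*}\ \prod_{j=1}^{\ell} \prod_{s=1}^{a_j} \frac{\partial\Phi^{(n_s^j)}}{\partial\varphi_j}(A', \varphi) \Big\} + \omega_0 \cdot \frac{\partial\Phi^{(k)}}{\partial\varphi}(A', \varphi) \equiv N^{(k)}(A', \varphi) + \omega_0 \cdot \frac{\partial\Phi^{(k)}}{\partial\varphi}(A', \varphi) \tag{5.10.34}$$

for $k = 1, 2, \ldots$, and the $*$ means that the sum is performed subject to the
constraint $\sum_{j=1}^{\ell} \sum_{s=1}^{a_j} n_s^j = k - 1$. Furthermore, we set

$$\mathcal{N}^{(0)}(A', \varphi) = h(A'). \tag{5.10.35}$$

The condition that $\mathcal{N}^{(1)}$ is $\varphi$-independent (hence, $\varphi'$ independent) becomes,
by Eq. (5.10.34),

$$f(A', \varphi) + \omega_0 \cdot \frac{\partial\Phi^{(1)}}{\partial\varphi}(A', \varphi) = \{\varphi\text{-independent function}\} \tag{5.10.36}$$

and it determines $\Phi^{(1)}$, up to a function of $A'$ alone, as:

$$\Phi^{(1)}(A', \varphi) = \sum_{\nu \ne 0} \frac{f_\nu(A')\, e^{i\nu\cdot\varphi}}{-i\, \omega_0 \cdot \nu} \tag{5.10.37}$$

where $f_\nu(A')$ is the $\nu$-th Fourier coefficient of $f(A', \varphi)$ at $A'$ fixed:

$$f_\nu(A') = \sum_{a \in Z_+^\ell} f_\nu^{(a)} (A' - A_0)^a. \tag{5.10.38}$$

Replacing $f_\nu(A')$ in Eq. (5.10.37) by Eq. (5.10.38) and using Eqs. (5.10.33)
and (5.10.7), one sees that the series in Eq. (5.10.37) converges and defines a
$C^\infty$ function of $(A', \varphi) \in S_{\rho_0}(A_0) \times T^\ell$ (actually such a function is analytic,
as could be shown).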
Then, from Eq. (5.10.34), it follows that

$$N^{(2)}(A', \varphi) = \sum_{j=1}^{\ell} f^{(e_j)}(A', \varphi)\, \frac{\partial\Phi^{(1)}}{\partial\varphi_j}(A', \varphi) \tag{5.10.39}$$

with $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, ...

From what has been said above, it follows that $N^{(2)}$ is a $C^\infty(S_{\rho_0}(A_0) \times T^\ell)$
function (actually analytic), and if $N_\nu^{(2)}(A')$ denotes its $\nu$-th Fourier
coefficient, the condition that $\mathcal{N}^{(2)}$ in Eq. (5.10.34) is $\varphi$-independent yields

$$\Phi^{(2)}(A', \varphi) = \sum_{\nu \ne 0} \frac{N_\nu^{(2)}(A')\, e^{i\nu\cdot\varphi}}{-i\, \omega_0 \cdot \nu} \tag{5.10.40}$$

which, again from Eq. (5.10.33) and from Eqs. (5.10.31), (5.10.37), and
(5.10.38), turns out to be a $C^\infty$ function on $S_{\rho_0}(A_0) \times T^\ell$ (actually analytic),
etc., inductively. Hence

$$h_{\varepsilon,n}(A') = h(A') + \sum_{k=1}^{n} \varepsilon^k N_0^{(k)}(A'). \tag{5.10.41}$$


mbe 
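A one-degree-of-freedom example, chosen here for illustration and not taken from the text, shows the recursion at work. Assume $\ell = 1$, $h(A) = \omega A$ and $f(A, \varphi) = A \cos\varphi$. The first step gives $\Phi^{(1)} = -A' \sin\varphi / \omega$ and a vanishing average at first order; at second order the $\varphi$-average of $\cos\varphi \cdot \partial\Phi^{(1)}/\partial\varphi$ is $-A'/(2\omega)$, so that to second order the normal form is $\omega A' - \varepsilon^2 A'/(2\omega)$ and the predicted pulsation is $\omega - \varepsilon^2/(2\omega)$. Since $\dot\varphi = \omega + \varepsilon\cos\varphi$ can be integrated by quadrature, the exact pulsation is $2\pi / \int_0^{2\pi} d\varphi / (\omega + \varepsilon\cos\varphi) = \sqrt{\omega^2 - \varepsilon^2}$. The sketch below (function names and numerical values are assumptions) compares the two.

```python
import math


def exact_frequency(omega, eps, samples=2000):
    """Mean rotation rate of phi' = omega + eps*cos(phi), by midpoint quadrature.

    The period is T = integral over [0, 2*pi] of dphi / (omega + eps*cos(phi)).
    Requires |eps| < |omega| so that the angle never stops.
    """
    step = 2.0 * math.pi / samples
    period = sum(
        step / (omega + eps * math.cos((k + 0.5) * step)) for k in range(samples)
    )
    return 2.0 * math.pi / period


def second_order_frequency(omega, eps):
    """Derivative in A' of the second-order normal form omega*A' - eps^2*A'/(2*omega)."""
    return omega - eps ** 2 / (2.0 * omega)


omega_value, eps_value = 1.0, 0.1  # illustrative values
exact = exact_frequency(omega_value, eps_value)
approx = second_order_frequency(omega_value, eps_value)
print(exact, approx, abs(exact - approx))
```

The discrepancy between the two values is of the size of the first neglected term, $\varepsilon^4/(8\omega^3)$, consistent with a remainder of higher order in $\varepsilon$.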


Observations. 

(1) Equations (5.10.37) and (5.10.40) and their generalizations to higher $k$
show that $N^{(k)}(A', \varphi)$ can be chosen to be $n$ independent. It becomes natural
to consider the limit as $n \to \infty$. In this limit, the perturbation would disappear
and the Hamiltonian would be transformed into

$$h_\varepsilon(A') = h(A') + \sum_{k=1}^{\infty} \varepsilon^k N_0^{(k)}(A')$$

and it would therefore be integrable. However, the estimates on $\varepsilon_n$ that can
be derived by applying the scheme suggested in the above proof appear to be
such that $\varepsilon_n \to 0$ as $n \to \infty$, save some exceptional cases. Therefore, nothing can
be concluded about the limit $n \to +\infty$.

It is known that it cannot happen, in general, that both series (“Birkhoff’s
formal series”)

$$\sum_{k=1}^{\infty} \varepsilon^k N_0^{(k)}(A'), \qquad \sum_{k=1}^{\infty} \varepsilon^k \Phi^{(k)}(A', \varphi) \tag{5.10.42}$$

converge, defining analytic functions of $(A', \varphi, \varepsilon)$ in $(A', \varphi) \in S_{\rho_0}(A_0) \times T^\ell$
and in $\varepsilon$ near zero and, at the same time, $\varepsilon^{n+1} f_\varepsilon^{(n)} \to 0$ as $n \to \infty$ uniformly in
the same region of $(A', \varphi, \varepsilon)$.

This would, in fact, imply the existence of $\ell$ prime integrals analytic in $\varepsilon, A, \varphi$
for $\varepsilon$ close to 0, $A$ close to $A_0$ and $\varphi \in T^\ell$: namely, $(A'_1, \ldots, A'_\ell)$, and via such
integrals (“uniform integrals”), the system would be analytically integrable,
with a canonical transformation with such integrals as the new action variables.
This property has been shown to be impossible in a number of interesting
cases.

A simple example in which the series (5.10.42) can be explicitly computed is
in Problem 16 at the end of this section: in the example the second of (5.10.42)
does not converge. However, if the $N_0^{(k)}(A)$ depend on $A$ via $\omega_0 \cdot A$ only, then the
series converge: this is a nice criterion (see [44]).
(2) Various algorithms used in practice to study perturbations of integrable
motions are based on the two propositions illustrated above. The simplest is
the following.

First, develop $f$ in a Fourier series. This usually causes great problems.
In fact, it is often possible to compute only a few Fourier coefficients for $f$.
On the other hand, such coefficients often decrease very quickly as $\nu \to \infty$.
Then, if $f$ is written as

$$f = f^{[\le N]} + f^{[>N]}, \tag{5.10.43}$$

where, for $f$ given by Eq. (5.10.31), we set [see Eq. (5.10.38)]

$$f^{[\le N]}(A, \varphi) = \sum_{a \in Z_+^\ell} \sum_{\nu \in Z^\ell,\ |\nu| \le N} f_\nu^{(a)} (A - A_0)^a\, e^{i\nu\cdot\varphi} \equiv \sum_{\nu \in Z^\ell,\ |\nu| \le N} f_\nu(A)\, e^{i\nu\cdot\varphi} \tag{5.10.44}$$

one has that $\varepsilon f^{[>N]}$ is very small even for $N$ not too large and its contribution
to the Hamiltonian equation produces an error, in a fixed given time, much
smaller than $O(\varepsilon)$, say $O(\varepsilon\eta)$ with $\eta \ll 1$.

It is then possible to apply Proposition 16 to the system with Hamiltonian
$h + \varepsilon f^{[\le N]}$ and remove the perturbation to $O(\varepsilon^2)$. In the new variables, neglecting
the perturbation of $O(\varepsilon^2)$ will cause an error, over a fixed time, of order
$O(\varepsilon^2 + \varepsilon\eta)$ on the solutions of the original equations. This is often a very good
approximation if $\omega(A) \cdot \nu \ne 0$, $\forall\, 0 < |\nu| \le N$, $\forall\, A \in$ {set of interesting initial
actions}.
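The choice of the cutoff $N$ can be illustrated with a small computation. The sketch below is an assumption-laden example, not the author's procedure: it takes one angle ($\ell = 1$) and supposes that the Fourier coefficients obey an exponential bound $|f_\nu| \le R\, e^{-\xi|\nu|}$ of the type of Eq. (5.10.33), with hypothetical constants `R` and `xi`. Under this bound the neglected tail is at most $2R\, e^{-\xi(N+1)}/(1 - e^{-\xi})$, and the code returns the smallest $N$ for which this is below a prescribed $\eta$.

```python
import math


def tail_bound(R, xi, N):
    """Upper bound on sum over |nu| > N of R*exp(-xi*|nu|), one angle (l = 1)."""
    return 2.0 * R * math.exp(-xi * (N + 1)) / (1.0 - math.exp(-xi))


def smallest_cutoff(R, xi, eta):
    """Smallest N >= 0 such that the neglected Fourier tail is at most eta."""
    N = 0
    while tail_bound(R, xi, N) > eta:
        N += 1
    return N


# Hypothetical constants: R = 1, xi = 1, target accuracy eta = 1e-3.
cutoff = smallest_cutoff(1.0, 1.0, 1e-3)
print(cutoff, tail_bound(1.0, 1.0, cutoff))
```

Because the bound decays exponentially, the cutoff grows only logarithmically as $\eta$ decreases, which is the reason why, in the words of the text, the remainder is very small even for $N$ not too large.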
(3) A special case of great importance to which, however, the above algorithm 
cannot be applied directly is that of the perturbations of the motion of the 
Kepler system when, in defining the unperturbed system, one neglects the 
reciprocal attractions between the planets [i.e., one takes Eq. (5.9.2) as a 
perturbation of Eq. (5.9.1)]. 

As we saw, the Kepler motions are rigorously periodic, and to every planet 
a single pulsation is associated rather than three: the other two vanish (or are 
integer multiples of the first, depending on which variables are chosen to inte- 
grate the motion) as a consequence of the conservation of angular momentum 
and of the wonderful nature of the Newtonian force which singles it out among 
the central forces as the most impressive, see §4.9 and §4.10. 

It is therefore certainly impossible to satisfy Eq. (5.10.10) with reasonable 
N. Hence, the above approximation scheme cannot be applied. 
Nevertheless, a similar scheme can be applied. Consider the motions in
action-angle coordinates $(A, \varphi)$, where $A = (A^{(1)}, \ldots, A^{(n)})$, $\varphi = (\varphi^{(1)}, \ldots,
\varphi^{(n)})$, where $(A^{(i)}, \varphi^{(i)})$ are the natural variables, for the systems Sun-$i$-th
planet, in terms of which the Hamiltonian takes the form (if $M_S$ = Sun mass,


m; = i-th planet mass, e;; = G4, see p.458: 


Ms 
EMM; 
= Eij (5.10.45) 
A- ana] oa a ean 
ho(A) = X Fo( Al”), (5.10.46) 
i=l 


having denoted $A_1^{(j)}$ the first component of $A^{(j)} = (A_1^{(j)}, A_2^{(j)}, A_3^{(j)})$, and we
recall that $A^{(j)}$ can be chosen as follows (see problems for §4.10):


dep (5.10.47) 


AX? =m;A(j) cos i(j) = 6 ;, 


where $A(j)$ is the areal velocity of the $j$-th planet, $E_j$ its energy, $a_j$ is the
major semiaxis of its orbit, and $i(j)$ is the inclination of the $j$-th orbit on
the ecliptic plane (the ecliptic plane is traditionally the plane of the Earth
orbit or more precisely a reference plane fixed with the stars and parallel to
a conventional average plane of the Earth orbit).


The angle variables associated with such action variables are yl = 


VAS yp? = = g0), pÑ ) = hO) known in astronomy as the “average anomaly”, 
the “major semiaxis longitude” and the “node-line longitude” with respect to
the fixed axes established on the ecliptic plane (i.e., on the $xy$ plane of the
chosen inertial frame); see Problems 11 and following to §4.10, p.303, for a
discussion of these variables.


Equation (5.10.46) shows that in the unperturbed motions, $g^{(j)}, h^{(j)}$ are
constants (i.e. $\omega_2^{(j)} = \omega_3^{(j)} = 0$) since $h_0$ only depends on the variables $L_j$, $j =
1, \ldots, n$.

One can then proceed to write the perturbation in Eq. (5.10.45) in terms of
the $(A, \varphi)$ coordinates (a nontrivial task, in practice; see Problem 15, p.305,
§4.10 for the similar question in the case of the planar problem), and afterwards
one can try to apply the scheme seen in the proof of Proposition 16
to build a canonical map $(A, \varphi) \to (A', \varphi')$ transforming Eq. (5.10.45) into
a function independent of the $\varphi_1^{(j)}$ variables, $j = 1, \ldots, n$, to first order in
$\varepsilon = \max \varepsilon_{ij}$. One shall proceed as prescribed in the proof of Proposition 16,
considering $\varphi_2^{(j)}, \varphi_3^{(j)}$ as parameters.
If we call $f(A, \varphi)$ the perturbation term of Eq. (5.10.45), when expressed
in the action-angle variables apt to describe the unperturbed system, we
introduce the new canonical variables $(A', \varphi')$ via the generating function
$A' \cdot \varphi + \Phi(A', \varphi)$ with


P A’ wp 
ee ae 


where $N$ is a “large number” which we imagine here to have chosen such that
for some a priori given purposes, neglecting $f^{[>N]}$ in Eq. (5.10.45), produces
a negligible error. In this way we obtain a Hamiltonian having the form


n(A,..., 909, ...) = lA!) 


1) tay O f (5.10.49) 
+ehi( A... ÁS 3 Po p)”, 
and the equations of motion will become, j = 1,...,n, 
AY =0,  j=1,...n 
His ðh 1 ‘in 1 , 
a 54) (Ap eds i se); eS 2,8 
iG Ohi 4/0) (n), (2) ' 
AY) =— - a RS ies SY, o=2 
o a 1 P2 Yn ) (5.10.50) 
E} Oh + + 
JGO — 0 (1) (n) 
p aA i ee a LAG ) 
Ohy ‘a (n ‘(2 t 
ary (A; Me Ag ), z ), pr), 
up to $O(\varepsilon^2)$.

Since $h_1$ is $\varphi_1^{(j)}$ independent, the equations in curly brackets form a system
of Hamiltonian equations parameterized by the initial data of $A_1^{(j)}$ and with
$2n$ degrees of freedom. Once they are “solved”, the last of Eqs. (5.10.50) is an
ordinary differential equation expressing $\dot\varphi_1^{(j)}$ in terms of a known function of
$t$ and, therefore, it is “trivial”.

In celestial mechanics, it sometimes happens that inside the neighborhood
$W = V \times T^{3n}$ of interesting initial data $h_1$ itself can be written as the sum of
an integrable Hamiltonian and of a term proportional to a “small” parameter $\mu$.

It will then be possible to apply again perturbation theory, Proposition
16, to study the motion of the Hamiltonian system in Eqs. (5.10.50) described
by the second and third line equations, as a perturbation of a simple motion
(see Problem 2 at the end of this section where a similar but simpler situation
occurs).
A very interesting case when this happens is the case when the unperturbed
motion of the planets that one considers is a motion in which the planets wander
around orbits with small eccentricity and small inclination. The resulting
parameter $\mu$ is of an order of magnitude related to the maximum eccentricity
and to the maximum inclination.
(4) The representation of the planetary motion thus obtained is very suggestive:
the planet keeps moving with roughly the same revolution period on the
same elliptic orbit [in Eqs. (5.10.50), the first and the last equations say that
the average anomalies rotate with about the same unperturbed pulsations up
to $O(\varepsilon)$; but the node lines and the major semiaxis longitude have a movement
developing on a very slow time scale of $O(\varepsilon^{-1})$ because of the factor
$\varepsilon$ in the curly bracket equations in Eqs. (5.10.50)] called “precession” which
is quasi-periodic with the periods characteristic of the Hamiltonian $h_1$.²⁵ In
the same quasi-periodic way vary the inclinations of the orbits and the areal
velocities. The main motion obtained by neglecting $O(\varepsilon)$ in Eqs. (5.10.50)
should be called a “deferent motion”, while the $O(\varepsilon)$ corrections expressed
by the $\sigma = 2, 3$ differential equations in Eqs. (5.10.50) should be called the
“epicyclical” motions, to do some justice to the Greek astronomers and to
Ptolemy, in particular.

The above “Ptolemaic” description is accurate only to $O(\varepsilon^2 + \varepsilon\eta)\,T$ if $T$
is the time for which one wishes to make astronomical predictions.

The reader should consult books on celestial mechanics to see concrete 
applications of the procedures and approximation schemes to some astronom- 
ical problems (among which the simplest is the theoretical calculation of the 
precession of the perihelion of Mercury). 


5.10.1 Exercises and Problems 
1. Apply the idea of the proof of Proposition 17 to study the Hamiltonian system

$$\omega_0 \cdot A + \varepsilon g(\varphi), \qquad (A, \varphi) \in R^\ell \times T^\ell$$

with $\omega_0$ verifying Eq. (5.10.7). Deduce that the system is integrable for small $\varepsilon$ (for an
alternative solution to this problem, see Problem 1, p. 290, §4.8).


2. Apply the scheme suggested in Observation (3), p.473, to discuss to higher order the
motion associated with the system in $R^{\ell+1} \times T^{\ell+1}$:

$$A + \varepsilon\,(B \cdot \omega_0 + \mu\, g(A, B, \varphi, \psi))$$

if $(A, B, \varphi, \psi)$ are canonical action-angle variables $A \in R$, $B \in R^\ell$, $\varphi \in T^1$, $\psi \in T^\ell$ (note
that for $\varepsilon = 0$, this system has only 1 frequency rather than $\ell + 1$). Explicitly calculate the
“daily” and “secular” components of the motions to $O(\varepsilon^2 + \varepsilon\mu)$ after finding the secular
Hamiltonian $h_1$, see Observation (3), p.473, and assuming a “non resonance” condition on
$\omega_0$ like Eq. (5.10.7).


²⁵ It is called a “secular motion” since in some simple cases this time scale is of the order
of centuries.
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3. Same as Problem 2 for the system in R?: 4 (p3 + q?)4 + (p3 + q3) +e (2q1 + q2)*. (Hint: 
Find the action-angle variables $(A_1, A_2, \varphi_1, \varphi_2)$ when $\varepsilon = 0$ (just polar coordinates) for
the two oscillators; then completely canonically change variables $A = \frac12(A_1 + A_2)$, $B =
\frac12(A_1 - A_2)$, $\varphi = \varphi_1 - \varphi_2$, $\psi = \varphi_1 + \varphi_2$ and then apply the method of Observation (3),
p.473.)


2 
4. Same as Problem 2 for the system in R?: Pi} + e(qı — q2). 
2 3 ya +a ( ) 


5. Consider the “restricted three-body problem” in R?: 


2 2 
P P km, km2 mime 
H (pi, p2, 41,42) = —+ + — E : 
2mı 2m2 |a| laz] lqi — q2 


(p,q) € R?. Using the results of Problem 15, p.305, §4.10, write (with patience) up to 
second order in the eccentricities of the two bodies the Hamiltonian in the action-angle 
variables corresponding to € = 0); see Problem 11, p.303, §4.10. Show that if the eccen- 
tricities are neglected, together with quantities of order O(e?), the secular motion [in the 
language of Observation (3), p.473] is described by the Hamiltonian 


an mk? m3k? 2m da Emme 
ho +ehy 22 32 O ea ET 
f. 
i 2 0 ui) lo? +a — 2a! ah cosa 


(where L = mVka, a = {major semiaxis}; see Problem 11, p.303, §4.10) (“0-th order in 
the eccentricity” ). 


6. Show that in the context of Problem 5, the secular Hamiltonian h1, of the Hamiltonian 
in Problem 5 is eccentricity independent even to first order in the eccentricity. 

Does this mean that, to first order in the eccentricities, the Kepler ellipses remain fixed 
in space? (Answer: no.) Show that they move quasi-periodically “without full precession” 
(i.e., g1, g2 vary continuously with a small amplitude of oscillation, i.e., < 27) to first order 
in the eccentricities. 


7. Show that to second order in the eccentricities, the secular Hamiltonian of Problems 5 
and 6 depends both on the L’s and on the e’s (i.e., on the G’s) and has the form (without 
explicitly computing fij,) hi = ROL, LL) +e fri (L4 L5, gh — g1) + 2e es fis hyo 
g1) +e? f22(L4-L5, 95 — g1). Show that the above secular Hamiltonian is integrable and that 


it says that, if h are nontrivial, the relative position of the perihelions precesses to O(e?). 
(Hint: Use Problem 15, p.305, §4.10. Canonically change variables as 


Gi+G2 _ ~ Gi-Ge 
= z °? ee as G= 5 


and note that the Hamiltonian “effectively” takes the form of a Hamiltonian for a one- 
dimensional system (integrable by quadratures or by the Hamilton-Jacobi method).) 


y=gi +22, G 


8. In the context of Problem 7, attempt a concrete computation of fij and of the angular 
velocity of the precession, assuming that the unperturbed motions take place on ellipses of 
small eccentricity and with semiaxes a1, a2 such that a2 — a, is “of the order” of a; and a2 
(i.e., with quite different semiaxes). 


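9. (i) Let $\Gamma(L) \subset \mathbb{R}^\ell$ be a cube centered at the origin and with side $2L$. Let $\nu \in \mathbb{Z}^\ell$, $|\nu| > 0$
and let $\Gamma_\varepsilon(L)$ be the set of the points $\omega \in \Gamma(L)$ such that $\frac{|\omega \cdot \nu|}{|\nu|} < \varepsilon$. Show that the measure
of $\Gamma_\varepsilon(L)$ does not exceed $\varepsilon\sqrt{\ell}\,(2L\sqrt{\ell})^{\ell-1}$. (Hint: Just look at the geometrical meaning of
the inequality $\frac{|\omega \cdot \nu|}{|\nu|} < \varepsilon$; the $\sqrt{\ell}$ arises from $|\nu| = \sum_{i=1}^{\ell} |\nu_i| \le \sqrt{\ell}\,\big(\sum_{i=1}^{\ell} |\nu_i|^2\big)^{1/2}$.)


(ii) Deduce that the set $\Gamma_C$ of the points $\omega \in \Gamma(L)$ such that $|\omega \cdot \nu|^{-1} <
C|\nu|^\ell$, $\forall \nu \ne 0$, has a complement with Lebesgue measure not exceeding


$\frac{\sqrt{\ell}\,(2L\sqrt{\ell})^{\ell-1}}{C} \sum_{|\nu| > 0} \frac{1}{|\nu|^{\ell+1}}$


(see, also, Problem 11).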


10. Using Problem 9 show that UcIc = TE T(L) has the same Lebesgue measure of 
I'(L), i.e., (2L)*, although its complement is dense. 


11. Without using the Lebesgue-measure theory, infer from the inequalities of Problem 9 
above that I’, in Problem 10, is a dense set in P (L). 


12. Consider a time-dependent Hamiltonian with one degree of freedom: ho(A)+¢ fo(A, p, t), 
where (A,y) ER! x Tt and t €€ T! is interpreted as the time appearing in a 27-periodic 
time-dependent perturbation to the system with Hamiltonian ho. 

Develop a formal perturbation theory for the above system proving propositions analogous 
to Propositions 16 and 17 of this section. (Hint: Use a time-dependent canonical transfor- 
mation with generating function A’y + @09(A’,,t) and proceed, as in this section, using 
the Hamilton-Jacobi method.) 


13. Consider the time-dependent system on R! x Tt, a + e (cosp + cos(y — t)), and 
applying the results of Problem 12, remove the perturbation to O(e?) near the points with 
A=wo= 3 (1 + v5) (see exercises and problems to §2.20 for the theory of the number wo). 


14. Same as in Problem 13, but to O(e*). (Warning: The calculations are quite long.) 
15. Let h(A) be a C® function defined on a sphere Sp(Ao) C R with gradient w(A) = 
Ont) bounded by |w(A)| < E and such that the matrix Mij = ea 


Ta a A € So(Ao) and pee 1|(M~*)i3| < n < +00. Suppose that the correspondence 
A — w(A) is one to one between S,(Ao) and w(So(Ao)). Denote, for C > 0: 


is invertible 


So(Ao,C) = { A] A € So(Ao), |w(A)- v7! < Clv]*, Y |v] > 0} 


Show that there is B > 0, depending only on £, such that 


vol Se(Ao, C) . , _ B (Eno) 
volSe(Ao) ^ EC 


(Hint: Use the change of variable formula: 


l dA = dA — dA 
So(A0,C) So(Ao) So(Ao)/So(Ao.C) 
A 

= val Stay) -f jat Ajia 
w(Se(Ao))/Se(Ao,C) Ow 

> vol (Sp(Ao) -x f dw 

w(So(A0)/So(A0,C)) 

> vol (Se(A0)) - £ X` ita dw 
VA0" wv|/|v|<C~!|v|-e-1 

> vol (Sp(Ao)) — nO TH 2EVĄ VE X aa 

aise! 


and then recall that volSo (Ao) = conste.) 
16. Let w = (w,1) € R? be such that |wvı + v2| < C (|vı| + |v2|)% for some a,C > 0. 
Let f be a function on 71 with Fourier coefficients fy # 0,Vv Æ 0, eg., f(y) = 
2554 e §" cos ny, € > 0. Consider the Hamiltonian system on R? x T?: He = 
(w1 Ay + A2) + (€2 + f(~1) f(Y2)). Show that the Birkhoff formal series (5.10.42) are 


he(A’) =(wAi + A2) + € (A2 + f(yi)f(y2)), and 


—i (wy v wV v 
= eet ole (wni + V2) 1 +12 


and prove that the series for Pe does not converge. (Hint: Using the explicit solubility of 
the equations for He, see Problem 1, 84.8, p.290, one sees that the passage to action-angle 
variables for He must be singular for a dense set of values of e: the singularities arise in 
correspondence of the values of £ for which the formal sum of the @--series makes no sense 
(to sum formally the series permute them).) 


17. In the context of Problem 16, show that the function ®-, obtained by permuting >>, and 
>¢, and summing the geometric series, makes sense and is analytic in A, p for many values 
of £ and, whenever this happens, He is indeed integrable by the canonical map generated 
by s. (Hint: Use Problem 9 above to identify the values of € which allow bounds of the 
type |wv1 + (1 +e)r2|7! < C |v|®, C,a > 0, |v] > 0.) 
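
The non-resonance inequalities $|\omega \cdot \nu|^{-1} < C|\nu|^\ell$ that appear in Problems 9-11 can be explored numerically. The following is a minimal sketch, not part of the original text: it takes $\ell = 2$, the particular vector $\omega = (\frac12(1+\sqrt5), 1)$, and a finite search range `n_max`, all of which are illustrative assumptions. A finite search can only suggest a value of $C$; it cannot prove that the inequality holds for all $\nu$.

```python
import math


def diophantine_constant(omega, n_max):
    """Largest value of 1 / (|omega . nu| * |nu|**2) over 0 < |nu| <= n_max.

    Here |nu| = |nu_1| + |nu_2| and the exponent 2 is the number of
    frequencies (l = 2).  Returns math.inf on an exact resonance.
    """
    best = 0.0
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            size = abs(n1) + abs(n2)
            if size == 0 or size > n_max:
                continue
            dot = abs(omega[0] * n1 + omega[1] * n2)
            if dot == 0.0:
                return math.inf  # omega . nu = 0: no finite C can work
            best = max(best, 1.0 / (dot * size ** 2))
    return best


# Hypothetical test vectors: a "very irrational" one and a resonant one.
golden = ((1 + math.sqrt(5)) / 2, 1.0)
resonant = (1.5, 1.0)

if __name__ == "__main__":
    print(diophantine_constant(golden, 200))
    print(diophantine_constant(resonant, 10))
```

For the golden-mean vector the running maximum is reached at small $|\nu|$ and does not grow with `n_max`, while for the resonant vector the search hits $\omega \cdot \nu = 0$ and no finite $C$ exists.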


5.11 Some Simple Properties of Holomorphic Functions. 
Analytic Theorems for the Implicit Functions 


In §5.10, we mentioned, without discussion, some properties of the analytic
functions. Such properties can be derived in the more general context of the 
theory of holomorphic functions. 

Such functions are basically defined as analytic functions of complex
variables, i.e., a $\mathbb{C}^p$-valued function $f$ defined on an open subset $W \subset \mathbb{C}^q$ is
holomorphic if it can be developed in an absolutely convergent power series around
each point of $W$.

For a more detailed discussion of some perturbation theory problems, it 
is convenient to state the following definition which is general enough for our 
purposes. It is a definition that is provided more with the aim of fixing some 
notations rather than with the objective of developing the part of the holo- 
morphic functions theory that we need. In this and in the following sections, 
we suppose that the reader is familiar with the basic properties of holomorphic 
functions, i.e., the Cauchy integral formula, the theory of the Taylor-Laurent
expansions in power series, and the identity principle. Such properties will be
repeatedly used in §5.11 and §5.12.

Let $\ell, p, q$ be positive integers.


8 Definition. (i) We introduce the following notations: $\forall\, a = (a_1, \ldots, a_\ell) \in \mathbb{Z}_+^\ell$,
$\nu = (\nu_1, \ldots, \nu_p) \in \mathbb{Z}^p$,


$|a| = \sum_{i=1}^{\ell} |a_i|, \qquad |\nu| = \sum_{i=1}^{p} |\nu_i|$, (5.11.1)


while if $w = (w_1, \ldots, w_q) \in \mathbb{C}^q$


$|w| = \max_{1 \le i \le q} |w_i|, \qquad \|w\| = \sum_{i=1}^{q} |w_i|$. (5.11.2)


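(ii) For $A_0 \in \mathbb{C}^\ell$, $\rho > 0$, $\xi > 0$ and $j = 1, \ldots, p$ we set


$\hat S_\rho(A_0) = \{A \mid A \in \mathbb{C}^\ell,\ |A - A_0| < \rho\}$,
$C(\xi) = \{z \mid z \in \mathbb{C}^p,\ e^{-\xi} < |z_j| < e^{\xi}\}$, (5.11.3)
$C(\rho, \xi; A_0) = \hat S_\rho(A_0) \times C(\xi)$.


The first two such sets will be called, respectively, the “complex multisphere”
with center $A_0$ and radius $\rho$, and the “complex multiannulus”, with inner
radius $e^{-\xi}$ and outer radius $e^{\xi}$. If $A_0$ is real, we define


$S_\rho(A_0) = \{A \mid A \in \mathbb{R}^\ell,\ |A_i - A_{0i}| < \rho\}$ (5.11.4)

calling it the “real multisphere” with center $A_0$ and radius $\rho$.
The set $S_\rho(A_0) \times \mathbb{T}^p$ will be identified with a subset of $C(\rho, \xi; A_0)$ via the map
$(A, \varphi) \to (A, z)$, $z_j = e^{i\varphi_j}$. (5.11.5)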


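(iii) If $W \subset \mathbb{C}^q$ is open and if $F$ is a $\mathbb{C}^p$-valued function, we say that $F$ has
a convergent power series expansion around $w_0 \in W$ if there is a family of
$\mathbb{C}^p$-vectors $\{F^{(a)}(w_0)\}_{a \in \mathbb{Z}_+^q}$ such that for some $\rho > 0$:


$F(w) = \sum_{a \in \mathbb{Z}_+^q} F^{(a)}(w_0)\,(w - w_0)^a, \qquad \forall\, |w - w_0| < \rho$, (5.11.6)

having set

$(w - w_0)^a = \prod_{j=1}^{q} (w_j - w_{0j})^{a_j}$, for $a = (a_1, \ldots, a_q)$, and (5.11.7)

$\sum_{a \in \mathbb{Z}_+^q} |F^{(a)}(w_0)|\, \rho'^{\,|a|} < +\infty, \qquad \forall\, \rho' < \rho$. (5.11.8)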


(iv) A function $F$ is holomorphic in the open subset $W \subset \mathbb{C}^q$ if it has a
convergent power series around every point $w \in W$. In this case, one defines
the derivatives of $F$ as


$\frac{\partial^{|a|} F}{\partial w^a}(w_0) = a!\, F^{(a)}(w_0), \qquad \forall\, w_0 \in W$ (5.11.9)

if $a! \overset{def}{=} \prod_j a_j!$.

Observation. One calls Eq. (5.11.6) the Taylor series of F at wo, because of 
Eq. (5.11.9). 


9 Definition. Let $\ell, p, q$ be positive integers and $A_0 \in \mathbb{R}^\ell$ and use the
notations of Definition 8 above.

(i) Let $f, g, h$ be three functions defined, respectively, on $S_\rho(A_0)$, $S_\rho(A_0) \times
\mathbb{T}^p$, $\mathbb{T}^p$ with values in $\mathbb{R}^q$. We shall say that they are holomorphic in $\hat S_\rho(A_0)$,
$C(\rho, \xi; A_0)$, $C(\xi)$ respectively if, identifying $S_\rho(A_0)$, $S_\rho(A_0) \times \mathbb{T}^p$, $\mathbb{T}^p$ as
subsets of $\hat S_\rho(A_0)$, $C(\rho, \xi; A_0)$, $C(\xi)$, as explained in Definition 8 (ii) above,
they can be extended to holomorphic functions $\bar f, \bar g, \bar h$ on the larger sets
$\hat S_\rho(A_0)$, $C(\rho, \xi; A_0)$, $C(\xi)$.

The functions of the type $\bar f, \bar g, \bar h$ will be called “holomorphic” in $\hat S_\rho(A_0)$,
$C(\rho, \xi; A_0)$, $C(\xi)$, respectively, and “real” on $S_\rho(A_0)$, $S_\rho(A_0) \times \mathbb{T}^p$, $\mathbb{T}^p$,
respectively. Sometimes the extensions $\bar f, \bar g, \bar h$ will still be called $f, g, h$, dropping
the bar.

(ii) If $F$ is holomorphic on $C(\rho, \xi; A_0)$ or on $C(\xi)$, we define its
“$\varphi$-derivatives” by setting $\frac{\partial F}{\partial \varphi_j} \overset{def}{=} i z_j \frac{\partial F}{\partial z_j}$.


Observations. 

(1) It is easy to deduce from the definition of an analytic function on $V \times \mathbb{T}^p$
(see Definitions 13 and 14, p.336 and p.337, §4.13) that if $f$, $g$ and $h$ are
analytic on $V$ or on $V \times \mathbb{T}^p$, $\mathbb{T}^p$, respectively, then given $A_0 \in V$, there exist
$\rho, \xi > 0$ such that $f$ is holomorphic in $\hat S_\rho(A_0)$, $g$ in $C(\rho, \xi; A_0)$, and $h$ in
$C(\xi)$. In general, however, $\rho$ and $\xi$ may be very small even if $V$ is large.

(2) This definition is particularly useful because it provides a simple
description of an important class of functions on $\mathbb{T}^p$ or on $S_\rho(A_0) \times \mathbb{T}^p$, thinking of
$\mathbb{T}^p$ as a subset of $C(\xi)$ via the natural correspondence


$\varphi = (\varphi_1, \ldots, \varphi_p) \in \mathbb{T}^p \longleftrightarrow z = (e^{i\varphi_1}, \ldots, e^{i\varphi_p})$ (5.11.10)
already pointed out several times.


The classical theorems on the theory of the holomorphic functions (Taylor 
and Laurent expansions, Cauchy’s formula, identity principle, etc.) imply the 
following proposition which we do not prove since it can be found, with other 
symbols, in any elementary textbook on holomorphic functions. 


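18 Proposition. Let $f, g, h$ be holomorphic functions on $\hat S_\rho(A_0)$, $C(\rho, \xi; A_0)$,
$C(\xi)$, respectively, see Eq. (5.11.4), with values in $\mathbb{C}^q$. Using the
notation of Definitions 8 and 9 and setting


$z^\nu = \prod_{j=1}^{p} z_j^{\nu_j}$, for $\nu \in \mathbb{Z}^p$, $z \in C(\xi)$ (5.11.11)

(i) Sequences of vectors in $\mathbb{C}^q$, $\{f^{(a)}\}_{a \in \mathbb{Z}_+^\ell}$, $\{g_\nu^{(a)}\}_{a \in \mathbb{Z}_+^\ell,\, \nu \in \mathbb{Z}^p}$, $\{h_\nu\}_{\nu \in \mathbb{Z}^p}$, exist
such that


$f(A) = \sum_{a \in \mathbb{Z}_+^\ell} f^{(a)} (A - A_0)^a$,
$g(A, z) = \sum_{a \in \mathbb{Z}_+^\ell,\, \nu \in \mathbb{Z}^p} g_\nu^{(a)} (A - A_0)^a z^\nu$, (5.11.12)
$h(z) = \sum_{\nu \in \mathbb{Z}^p} h_\nu z^\nu$.


(ii) Identifying $g(A, \varphi)$ with $g(A, z)$, and $h(\varphi)$ with $h(z)$ for all $z = (e^{i\varphi_1}, \ldots,
e^{i\varphi_p})$, $\varphi \in \mathbb{T}^p$, then


$f^{(a)} = \frac{1}{a!} \frac{\partial^{|a|} f}{\partial A^a}(A_0)$,
$g_\nu^{(a)} = \frac{1}{a!} \int \frac{\partial^{|a|} g}{\partial A^a}(A_0, \varphi)\, e^{-i\nu\cdot\varphi}\, \frac{d\varphi}{(2\pi)^p}$, (5.11.13)
$h_\nu = \int h(\varphi)\, e^{-i\nu\cdot\varphi}\, \frac{d\varphi}{(2\pi)^p}$.

(iii) Setting

$|f|_\rho = \sup |f(A)|, \qquad |g|_{\rho,\xi} = \sup |g(A, z)|, \qquad |h|_\xi = \sup |h(z)|$, (5.11.14)


where the suprema are taken over the functions' respective domains of
definition [and Eq. (5.11.2) is used for $|\cdot|$], it is


$|f^{(a)}| \le |f|_\rho\, \rho^{-|a|}, \qquad |g_\nu^{(a)}| \le |g|_{\rho,\xi}\, \rho^{-|a|} e^{-\xi|\nu|}, \qquad |h_\nu| \le |h|_\xi\, e^{-\xi|\nu|}$. (5.11.15)


(iv) If the coefficients of the series in Eq. (5.11.12) can be bounded by a
constant times, respectively, $\rho^{-|a|}$, or $\rho^{-|a|} e^{-\xi|\nu|}$, or $e^{-\xi|\nu|}$, then the sums of the
series of Eq. (5.11.12) define holomorphic functions on $\hat S_\rho(A_0)$, $C(\rho, \xi; A_0)$,
$C(\xi)$, respectively.

(v) The second of Eqs. (5.11.12) can also be written


$g(A, z) = \sum_{\nu \in \mathbb{Z}^p} g_\nu(A)\, z^\nu$ (5.11.16)


with $g_\nu(A)$ holomorphic in $\hat S_\rho(A_0)$ and such that its Taylor series around $A_0$
is obtained by inspecting the second of Eqs. (5.11.12) and considering the sum
over $a$ only.

(vi) If $f, g, h$ are real on $S_\rho(A_0)$, $S_\rho(A_0) \times \mathbb{T}^p$, $\mathbb{T}^p$ for $A_0 \in \mathbb{R}^\ell$,
the $f^{(a)}$ coefficients are also real, while $g_\nu^{(a)}$ and $h_\nu$ are complex conjugates to $g_{-\nu}^{(a)}$
and $h_{-\nu}$, respectively, and vice versa.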
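
The exponential decay of the Fourier coefficients of a function holomorphic in a multiannulus, $|h_\nu| \le |h|_\xi\, e^{-\xi|\nu|}$, can be checked numerically on a simple case. The sketch below is only an illustration under stated assumptions: $p = 1$, the test function $h(z) = (1 - r^2)/((1 - rz)(1 - r/z))$ with $r = 0.5$ (a hypothetical choice, holomorphic for $r < |z| < 1/r$, whose exact coefficients are $r^{|\nu|}$), and $\xi = 0.5 < \log(1/r)$.

```python
import cmath
import math

R = 0.5  # hypothetical parameter of the test function, 0 < R < 1


def h(z, r=R):
    """Test function, holomorphic in the annulus r < |z| < 1/r."""
    return (1 - r * r) / ((1 - r * z) * (1 - r / z))


def fourier_coefficient(nu, n=2048):
    """Approximate h_nu = (1/2pi) * integral of h(e^{i phi}) e^{-i nu phi} d phi."""
    total = 0j
    for k in range(n):
        phi = 2 * math.pi * k / n
        total += h(cmath.exp(1j * phi)) * cmath.exp(-1j * nu * phi)
    return total / n


def sup_on_annulus(xi, n=2048):
    """Sup of |h| on exp(-xi) <= |z| <= exp(xi), sampled on the two boundary
    circles (by the maximum principle the sup is attained on the boundary)."""
    best = 0.0
    for radius in (math.exp(-xi), math.exp(xi)):
        for k in range(n):
            z = radius * cmath.exp(2j * math.pi * k / n)
            best = max(best, abs(h(z)))
    return best


def check_decay(xi=0.5, nu_max=8):
    """True if |h_nu| <= |h|_xi * exp(-xi*|nu|) for all |nu| <= nu_max."""
    sup = sup_on_annulus(xi)
    return all(
        abs(fourier_coefficient(nu)) <= sup * math.exp(-xi * abs(nu))
        for nu in range(-nu_max, nu_max + 1)
    )


if __name__ == "__main__":
    print(check_decay())
```

The bound is far from sharp here, since the actual decay rate $\log(1/r)$ is larger than the chosen $\xi$; this reflects the fact that the estimate holds for every admissible $\xi$.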
Observations.

(1) Note that the convergence of the series of Eq. (5.11.12) stated in (i) follows
from Eq. (5.11.15) only if $|f|_\rho$, $|g|_{\rho,\xi}$, and $|h|_\xi$ are finite. This is, however, not
necessarily true in general so that (iii) and (iv) are not reciprocal statements.
(2) If $F$ is holomorphic in a region $W \subset \mathbb{C}^q$ and $w \in W$, $\hat S_\rho(w) \subset W$, and
if we wish to estimate the derivatives $\frac{\partial F}{\partial w}$, we can use Eqs. (5.11.15) and
(5.11.13) as follows.

Here and below, we regard a matrix-valued function with values on the
$\ell \times q$ matrices as a $\mathbb{C}^{\ell q}$-valued function,²⁶ and consider $F$ as a holomorphic function
on $\hat S_\rho(w)$. To bound the $\ell \times q$ matrix $\frac{\partial F}{\partial w}$ (assuming that $F$ is $\mathbb{C}^\ell$-valued),
consider the first of Eqs. (5.11.13) and (5.11.15) with $|a| = 1$. It gives


$\left|\frac{\partial F}{\partial w}(w)\right| \le \Big(\sup_{w' \in \hat S_\rho(w)} |F(w')|\Big)\, \rho^{-1} \le \Big(\sup_{w' \in W} |F(w')|\Big)\, \rho^{-1}$, (5.11.17)


where the second supremum is over $W$. From this remark, it follows that


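$\left|\frac{\partial f}{\partial A}\right|_{\rho'} \le \frac{|f|_\rho}{\rho - \rho'}, \qquad \left|\frac{\partial g}{\partial A}\right|_{\rho',\xi} \le \frac{|g|_{\rho,\xi}}{\rho - \rho'}$,
$\left|\frac{\partial g}{\partial z}\right|_{\rho,\xi'} \le \frac{|g|_{\rho,\xi}}{e^{-\xi'} - e^{-\xi}} \le |g|_{\rho,\xi}\, \frac{e^{\xi}}{\delta}$, (5.11.18)
$\left|\frac{\partial g}{\partial \varphi_j}\right|_{\rho,\xi'} = \left|z_j \frac{\partial g}{\partial z_j}\right|_{\rho,\xi'} \le \frac{|g|_{\rho,\xi}}{\delta}$

for $\rho' < \rho$, $\xi' < \xi$, if $\delta = \xi - \xi'$. Analogous inequalities hold for the higher
order derivatives, e.g., $\left|\frac{\partial^{|a|} f}{\partial A^a}\right|_{\rho'} \le a!\, |f|_\rho\, (\rho - \rho')^{-|a|}$ [see Eq. (5.11.9)].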

These simple estimates will be called “dimensional estimates”. In physics, 
one says that a “dimensional estimate” is any estimate of the derivative of 
a function F at a given point in terms of the function maximum in a region 
divided by the distance of the point to the region boundary (“characteristic 
magnitude of F” divided by a “characteristic length”). Recall that physicists 
rightly believe that all functions (with, possibly, some exceptions) are analytic. 
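
A dimensional estimate of this kind is easy to test numerically. The following sketch is an illustration only: the test function $F(w) = 1/(2 - w)$, the point $w_0 = 0$ and the radius $\rho = 1$ are hypothetical choices, and the derivative is computed from the Cauchy integral formula discretized with the trapezoidal rule.

```python
import cmath
import math


def derivative_by_cauchy(f, w0, rho, n=2048):
    """f'(w0) from the Cauchy formula on the circle |w - w0| = rho.

    With w = w0 + rho*e^{i theta} the formula reduces to the average of
    f(w) * e^{-i theta} / rho over theta (trapezoidal rule with n nodes).
    """
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        total += f(w0 + rho * cmath.exp(1j * theta)) * cmath.exp(-1j * theta)
    return total / (n * rho)


def dimensional_bound(f, w0, rho, n=2048):
    """Sup of |f| on the circle of radius rho around w0, divided by rho."""
    sup = max(abs(f(w0 + rho * cmath.exp(2j * math.pi * k / n))) for k in range(n))
    return sup / rho


def f(w):
    # hypothetical test function, holomorphic for |w| < 2
    return 1 / (2 - w)


if __name__ == "__main__":
    print(abs(derivative_by_cauchy(f, 0.0, 1.0)), dimensional_bound(f, 0.0, 1.0))
```

For this choice the exact derivative is $F'(0) = 1/4$, while the dimensional bound is $1$: the estimate is correct but, as usual, not sharp.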

We now possess the terminology necessary to formulate the analytic im- 
plicit function theorem. This theorem is a particularly simple and strong ver- 
sion of the ordinary implicit function theorem valid when the defining function 
is analytic. It will play a key role in the proof of Proposition 22 which, in turn, 
is the heart of the proof of Proposition 15, p.460, (the KAM theorem) in the 
analytic case. 

The proof of the propositions that follow uses elementary aspects of the 
theory of holomorphic functions and it will be discussed in Appendix N. 
Propositions 19-21 are “analytic implicit function theorems”. 


26 so that $|M(w)| = \sup_{ij} |M_{ij}(w)|$, $\|M(w)\| = \sum_{ij} |M_{ij}(w)|$.
19 Proposition. Let $\ell > 0$ be an integer, $A_0 \in \mathbb{R}^\ell$, and $f$ be a $\mathbb{C}^\ell$-valued
function holomorphic in the complex multisphere $\hat S_\rho(A_0)$ and real on $S_\rho(A_0)$.
Consider the equation for $A$:


A—Ay+f(A) =0. (5.11.19) 


There exists a constant $\gamma$ (one can take, e.g., $\gamma = 2^8$) such that if


y\flo < 1, (5.11.20) 
Eq. (5.11.19) admits a unique solution Ay € So(Ao0), ie. Ai € RE and 


A corresponding proposition can be formulated for equations in $\mathbb{T}^p$.


20 Proposition. Let $p, \ell > 0$ be integers, let $A_0 \in \mathbb{R}^\ell$, and $\rho, \xi, \delta > 0$. Let $g$
be an $\mathbb{R}^p$-valued analytic function on $S_\rho(A_0) \times \mathbb{T}^p$ holomorphic in $C(\rho, \xi; A_0)$.
Consider the equation


$\varphi' = \varphi + g(A, \varphi)$ (5.11.22)


thought of as an equation on $\mathbb{T}^p$ parameterized by $\varphi' \in \mathbb{T}^p$ and $A \in S_\rho(A_0)$.
Then there exists a constant $\gamma$ (e.g., $\gamma = 2^8$) such that:
(i) Equation (5.11.22) is soluble if


y|glo s e?! <1 (5.11.23) 


and admits a solution of the form


$\varphi = \varphi' + \Delta(A, \varphi')$ (5.11.24)


with $\Delta$ being an $\mathbb{R}^p$-valued analytic function on $S_\rho(A_0) \times \mathbb{T}^p$ holomorphic in
$C(\rho, \xi - \delta; A_0)$.
(ii) The function $\Delta$ can be bounded as


$|\Delta|_{\rho, \xi - \delta} \le |g|_{\rho, \xi}$. (5.11.25)


(iii) The only function inverting Eq. (5.11.22) and enjoying the properties (i)
and (ii) above is $\Delta$.


Observations. 
(1) The reader should note that the above two implicit function theorems
have “dimensional nature”, i.e., they just say what can be naively guessed.
In fact, in order to invert an implicit equation “close to the identity” like
Eq. (5.11.22), one expects to have to impose that the derivatives of $g$ are small
compared to the derivatives of the identity map (i.e., small compared to 1).
This is precisely the meaning of Eq. (5.11.23): if we wish to invert inside the
annulus with external radius $e^{\xi - \delta}$ and internal radius $e^{-(\xi - \delta)}$, we estimate the
gradient of $g$ in the region by $|g|_{\rho,\xi}\, \delta^{-1} e^{\xi}$, see Eq. (5.11.18). For $\xi > 1$, this
is still not the same as Eq. (5.11.23) (while it is such for $\xi < 1$). However, if $\xi$
is large, we are asking for the inversion of Eq. (5.11.16) in a very large region
and extra conditions stem out of the requirement of global invertibility²⁷ (see
the proof).

(2) Proposition 19 is an infinite-dimensional version of the implicit function
theorem, since one can consider all the Taylor coefficients of $f$ at $A_0$ as
parameters in Eq. (5.11.19). Also, Eq. (5.11.22) is susceptible to such an
interpretation.

(3) Note that the constant $\gamma$ in Propositions 19 and 20 is $\ell$ and $p$-independent.
It is also the same in Propositions 19 and 20 (this has been arranged so
as to avoid introducing too many constants). The numerical value of $\gamma$ is not
optimal.

(4) Proposition 20 is remarkable because it is a “global inversion” theorem.
The equation is posed on all of $\mathbb{T}^p$ and not just locally.
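
A concrete one-dimensional instance of an equation of the type (5.11.22) is the Kepler equation $\ell = \xi - e \sin \xi$ (here $p = 1$, $\varphi' = \ell$, $\varphi = \xi$, $g(\xi) = -e\sin\xi$, with no dependence on $A$). The sketch below inverts it globally on the circle by a plain contraction iteration; this is an illustrative assumption and not the method of proof of Proposition 20, and the eccentricity value used is hypothetical.

```python
import math


def invert_near_identity(l, e, iterations=200):
    """Solve l = xi - e*sin(xi) for xi by iterating xi <- l + e*sin(xi).

    The map is a contraction for 0 <= e < 1, so the iteration converges
    for every real l.
    """
    xi = l
    for _ in range(iterations):
        xi = l + e * math.sin(xi)
    return xi


def delta(l, e):
    """Correction Delta(l) = xi(l) - l, the analogue of Delta in Eq. (5.11.24)."""
    return invert_near_identity(l, e) - l


if __name__ == "__main__":
    e = 0.3  # hypothetical eccentricity
    for l in (0.0, 1.0, 2.5, 4.0):
        xi = invert_near_identity(l, e)
        print(l, xi, xi - e * math.sin(xi) - l)
```

The correction `delta` is $2\pi$-periodic in $\ell$, so it defines a function on $\mathbb{T}^1$, and on the real torus it satisfies $|\Delta| \le e = \sup|g|$, in the spirit of the bound (5.11.25).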


A proposition analogous to Proposition 20 holds for the equation


$w' = w + G(w, z)$, (5.11.26)


where $G$ is a $\mathbb{C}^\ell$-valued function holomorphic on $C(\rho, \xi; A_0)$ and real on
$S_\rho(A_0) \times \mathbb{T}^p \subset C(\rho, \xi; A_0)$:

21 Proposition. Let $\ell, p > 0$ be integers, $A_0 \in \mathbb{R}^\ell$ and $\rho, \xi, \tau > 0$, $\tau < 1$.
Let $G$ be an $\mathbb{R}^\ell$-valued analytic function on $S_\rho(A_0) \times \mathbb{T}^p$ holomorphic on $C(\rho, \xi; A_0)$.
Consider Eq. (5.11.26) as an equation for $w$ parameterized by $w', z$.

(i) There is a constant $\gamma$ (e.g., again, $\gamma = 2^8$) such that if


1 pi 


y|Gloeg 7 <1, (5.11.27) 


Eq. (5.11.26) is soluble, Vw! € S,(Ao) and admits a solution of the form 


$w = w' + D(w', z)$ (5.11.28)
with D holomorphic on C(oe77, €; Ao) real on So(Ao) x 7P. 
(ii) The following bound can be put on D: 

üj ae (5.11.29) 


(iii) D is the only function inverting Eq. (5.11.26) and enjoying the properties 
(i) and (ii) above. 

(iv) Fixing w € C(oe77, £; Ao), Eq. (5.11.28) yields the only w € C (o, £; Ao) 
verifying Eq. (5.11.26). 


Observations. 
(1) The above proposition makes sense, and is true, in a natural way if p = 0 


27 Also, it makes a difference to bound $\partial/\partial\varphi$ rather than $\partial/\partial z$ for $\xi$ large.
(just drop everywhere $z$ and the index $\xi$). Likewise, Proposition 20 makes
sense in a natural way if $\ell = 0$ (just drop $A$ and the index $\rho$ everywhere).
(2) Setting $w' = 0$ in Eq. (5.11.26) as well as $p = 0$, $A_0 = 0$ and applying
Proposition 21, one deduces Proposition 19 with $A_0 = 0$. Since $A_0 = 0$ is
clearly not restrictive, Proposition 19 is a corollary of Proposition 21.

(3) This proposition is clearly analogous to Proposition 20 and has, also, a 
“dimensional nature”, see observation (1), p.484; more generally, the com- 
ments made on Proposition 20 can be repeated with obvious modifications 
for Proposition 21. An analogue of item (iv) in Proposition 21 could also be 
formulated for Proposition 20, but it will not be needed. 


5.11.1 Problems and Exercises 


1. After studying the proof in Appendix N of Proposition 20, find a better value for the 
constant y. 


2. Same as Problem 1 for Proposition 21 and for its corollary, Proposition 19. 


3. Apply Proposition 20 to invert the equation $\ell = \xi - e \sin \xi$, $\ell \in \mathbb{T}^1$, $\xi \in \mathbb{T}^1$, appearing in
the theory of the two-body problem, see Problem 13, p.304, §4.10. Here $e$ is a parameter,
$0 < e < 1$ (“eccentricity”). Find for which values of $e \in \mathbb{R}$ the above equation can be
globally inverted if the estimates in the theorem are applied.


4. Same as Problem 3 with the new y computed in Problem 1. 


5. Show that the equation in Problem 3 can be inverted for all e € [0,1) in the sense that 
there is a function g analytic on T! such that € = £ — g(£), for each given e € [0, 1). (Hint: 
Do not use Proposition 20 directly.) 


6.* (Levi-Civita) Check that g(£) is holomorphic in the unit disk |n| < 1 if 


def cevy 172? 
14+Vl-& 


(from Vol. 2, p. 321 in [24]). Draw with the help of a computer the curve in the complex 
plane of the e’s such that |n| = 1 and check that its point closest to the origin is imaginary 


and at a distance £z ~ 0.662... It is the radius of convergence in € of g(@). (“Laplace limit”, 


p-304). (Hint: Given € € C the Jacobian of the map is 1 — ecos€ and is 0 for cos€ = 1 


= 
This means that the equation ¢ = ze~£*"§, with ¢ = ett, z = e$, has a solution with £ 
real if |cosé + isinéJe~**™£ > 1. Since cos = 4 sing = +l veZ — 1 this is implied by 


y/1—e2 
n = &— < 1. Check that the singularity of g in e closest to the region occurs for 
1+V1-e2 
e = io and —2 eV 1+2? = 1, which defines the radius of convergence ez, of g in powers 
Q T+ 1402 » W. verg L g pow 


of £, i.e. the Laplace limit.) 
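
The numerical value of the Laplace limit quoted in Problem 6 can be reproduced with a few lines of code. The sketch below assumes the classical form of the Levi-Civita variable, $\eta = e\, \exp\sqrt{1 - e^2}\,/\,(1 + \sqrt{1 - e^2})$, evaluated at a purely imaginary eccentricity $e = i\rho$, so that $|\eta| = \rho\, \exp\sqrt{1 + \rho^2}\,/\,(1 + \sqrt{1 + \rho^2})$; the condition $|\eta| = 1$ is then solved for $\rho$ by bisection. The function names and the bracketing interval $[0, 1]$ are illustrative choices.

```python
import math


def levi_civita_modulus(rho):
    """|eta| for a purely imaginary eccentricity e = i*rho, rho >= 0."""
    s = math.sqrt(1 + rho * rho)
    return rho * math.exp(s) / (1 + s)


def laplace_limit(tol=1e-12):
    """Root of |eta(i*rho)| = 1 on [0, 1], found by bisection.

    levi_civita_modulus is increasing in rho, equals 0 at rho = 0 and
    exceeds 1 at rho = 1, so the root is unique in the bracket.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if levi_civita_modulus(mid) < 1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(laplace_limit())
```

The root found in this way agrees with the value $0.662\ldots$ stated in the problem.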
5.12 Perturbations of Trajectories. Small Denominators 
Theorem 


Another perturbative problem that could be studied is the following. Let
$(A, \varphi) \to h_0(A)$ be an analytic Hamiltonian on $V \times \mathbb{T}^\ell$ which we suppose
such that the matrix


$M_0(A)_{ij} = \frac{\partial^2 h_0}{\partial A_i\, \partial A_j}(A)$ (5.12.1)

has determinant $\ne 0$ on $V \times \mathbb{T}^\ell$ (“integrable non isochronous system”).
Given $A_0 \in V$, the torus $\{A_0\} \times \mathbb{T}^\ell$ is an $\ell$-dimensional torus invariant
for the motion associated with the Hamiltonian $h_0$. The Hamiltonian flow
on the phase space $V \times \mathbb{T}^\ell$ induces on the torus a quasi-periodic flow $\varphi \to
\varphi + \omega_0 t$, $t > 0$, with pulsations

$\omega_0 = \frac{\partial h_0}{\partial A}(A_0) \overset{def}{=} \omega(A_0)$. (5.12.2)

If $f_0$ is an analytic function on $V \times \mathbb{T}^\ell$, it is natural to ask whether the motions
on $V \times \mathbb{T}^\ell$ associated with the perturbed Hamiltonian,


$H_0(A, \varphi) = h_0(A) + f_0(A, \varphi)$, (5.12.3)


leave a torus invariant, inducing on it a quasi-periodic flow with pulsations 
w(Ao), i.e., “with the same spectrum” as before. One could call this problem 
“the spectrum-conservation problem”. 

Intuitively, one could expect that a torus on which a quasi-periodic
motion with pulsations $\omega(A_0)$ takes place will continue to exist but it will be
“deformed” inside $V \times \mathbb{T}^\ell$ if compared to the one relative to the $f_0 = 0$ case,
at least if $M_0(A_0)$ is invertible²⁸ and $f_0$ is small.

This perturbation problem differs from the one of the preceding sections; 
the latter was in fact concerned with the study of the perturbations of motions 
with given initial datum. A whole family of motions is now considered, which 
enjoy a certain common property, namely, quasi-periodicity with pulsations 
w(Ao), and we ask whether a family of motions with the same property still 
exists after perturbation. Proposition 15 of §5.9 provides an answer, in some 
sense affirmative. 

A proposition will now be formulated which, as it appears from the obser- 
vations that follow it, also proves important parts of Proposition 15 and gives 
all the ingredients necessary for its full proof in the analytic case. 

The proof of Proposition 22 that follows is taken from Arnold and is specif- 
ically fit for the analytic case under examination. The analogous proposition 


28 In the case $\omega(A) = \omega_0$, $\forall A \in V$ and, hence, $M_0(A) = 0$ it is easy to give a
counterexample. Let $\ell = 1$, $h_0(A) = A$ so that $\omega(A) = 1$ and $M_0 = 0$. Let $f(A, \varphi) = \varepsilon A$.
Then the unperturbed motions have pulsation 1, while the perturbed ones have pulsation
$(1 + \varepsilon) \ne 1$, if $\varepsilon \ne 0$.
in the $C^{(k)}$-differentiable case (with $k$ large enough) is due to Moser, [33], and
is based on a technically different method. 

Before stating Proposition 22, which will be called “small denominators 
theorem” for reasons manifest from its proof (or “Arnold’s theorem”, [2]), 
some notations are needed, see also Eqs. (5.11.1)-(5.11.3), (5.11.14). 


10 Definition. (i) If $a \in \mathbb{Z}_+^\ell$, $\nu \in \mathbb{Z}^\ell$, let $|a| = \sum_{i=1}^{\ell} |a_i|$, $|\nu| = \sum_{i=1}^{\ell} |\nu_i|$.
(ii) If $w \in \mathbb{C}^q$, we set $|w| = \max_{1 \le i \le q} |w_i|$, $\|w\| = \sum_{i=1}^{q} |w_i|$.

An $\ell \times \ell$ matrix $M$ will be regarded as an element of $\mathbb{C}^q$ with $q = \ell^2$ so that it
will make sense to write $|M|$, $\|M\|$.

(iii) If $f, h$ are holomorphic in $C(\rho, \xi; A_0)$, $\hat S_\rho(A_0)$ respectively²⁹ and take
values in $\mathbb{C}^q$ let [see Eqs. (5.11.11) and (5.11.3)]


$|f|_{\rho,\xi} = \sup |f(A, z)|, \qquad \|f\|_{\rho,\xi} = \sup \|f(A, z)\|$,
$|h|_\rho = \sup |h(A)|, \qquad \|h\|_\rho = \sup \|h(A)\|$
where the suprema are taken over the domains of the various functions.
The small-denominators theorem can then be formulated as follows.


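22 Proposition. Let $h_0, f_0$ be two real analytic functions on $S_{\rho_0}(A_0) \times \mathbb{T}^\ell$
holomorphic in $C(\rho_0, \xi_0; A_0)$, $\xi_0 < 1$. Assume that $h_0$ depends only on the action
variables $A$ in $(A, \varphi) \in S_{\rho_0}(A_0) \times \mathbb{T}^\ell$ and that the matrix $M_0$ of Eq. (5.12.1)
is nonsingular. Suppose that $\omega_0 = \frac{\partial h_0}{\partial A}(A_0)$ has the “non resonance” property:


$|\omega_0 \cdot \nu|^{-1} < C|\nu|^\ell, \qquad \forall \nu \in \mathbb{Z}^\ell,\ |\nu| > 0$ (5.12.4)


for some $C > 0$ (“resonance parameter”). Let $E_0, \eta_0, \varepsilon_0$ be such that


$E_0 > \left|\frac{\partial h_0}{\partial A}\right|_{\rho_0}, \qquad \eta_0 > \|M_0^{-1}\|_{\rho_0}, \qquad \varepsilon_0 > \left|\frac{\partial f_0}{\partial A}\right|_{\rho_0,\xi_0} + \frac{1}{\rho_0} \left|\frac{\partial f_0}{\partial \varphi}\right|_{\rho_0,\xi_0}$. (5.12.5)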
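Then there exist constants $B, a, b, c > 0$, only depending upon the number $\ell$ of
degrees of freedom,³⁰ such that if


$B\, C\varepsilon_0\, (C E_0)^a\, (\eta_0 E_0 \rho_0^{-1})^b\, \xi_0^{-c} < 1$ (5.12.6)


one can find in $S_{\rho_0}(A_0) \times \mathbb{T}^\ell$ a torus $\mathcal{T}(\omega_0)$ with parametric equations


$A = A_0 + \alpha(\varphi'), \qquad \varphi = \varphi' + \beta(\varphi'), \qquad \varphi' \in \mathbb{T}^\ell$ (5.12.7)

and such that:
(i) $\mathcal{T}(\omega_0)$ is invariant for the evolution in $S_{\rho_0}(A_0) \times \mathbb{T}^\ell$ associated with the
Hamiltonian (5.12.3). On $\mathcal{T}(\omega_0)$ the evolution is described by the map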


29 See (5.11.3) for the meaning of the symbols. 
30 e.g., a rather rough, though not “totally absurd”, estimate says that one can take a = 
b= 14, c = 2(102 + 6), B = (120)!104" (very far from optimal). 
$\varphi' \to \varphi' + \omega_0 t, \qquad t \in \mathbb{R}_+$ (5.12.8)


and is therefore quasi-periodic with pulsations $\omega_0$.
(ii) The functions $\alpha, \beta$ are analytic on $\mathbb{T}^\ell$ and


o lale) + B] < a. (5.12.9) 


Observations. 
(1) Using the notations of Proposition 18, p.481, §5.11, Eqs.
(5.11.16) and (5.11.12) imply that $f_0$ can be written as


$f_0(A, z) = \sum_{\nu \in \mathbb{Z}^\ell} f_{0\nu}(A)\, z^\nu = \sum_{a \in \mathbb{Z}_+^\ell,\ \nu \in \mathbb{Z}^\ell} f_{0\nu}^{(a)}\, (A - A_0)^a z^\nu$, (5.12.10)


where $f_{0\nu}(A)$ is the sum of the series in $a$ in the right-hand side of Eq.
(5.12.10) and is holomorphic in $\hat S_{\rho_0}(A_0)$.
Then the derivatives $\partial/\partial\varphi$ appearing in Eq. (5.12.5) can be simply defined as


$\partial/\partial\varphi_j \overset{def}{=} i z_j\, \partial/\partial z_j$ (see Definition 9 (ii), p.481).


(2) It follows from the theory of the Taylor-Laurent expansions for holomorphic
functions that if $g_1, \ldots, g_r$ are $r$ real analytic functions on $V \times \mathbb{T}^\ell$, $V \subset \mathbb{R}^\ell$
open, it is possible to find two functions $A \to \rho(A)$, $A \to \xi(A)$ positive and
continuous on $V$ such that $S_{\rho(A)}(A) \subset V$, $\forall A \in V$, and, furthermore, such that
$g_1, \ldots, g_r$ are holomorphic in $C(\rho(A), \xi(A); A)$, see Definition 8, p.479, and
$|g_j|_{\rho(A),\xi(A)}$ are continuous functions of $A$ in $V$, $j = 1, \ldots, r$.

Therefore Proposition 22 could be formulated in an apparently more general
form by only requiring the analyticity of $h_0$, $f_0$, $M_0^{-1}$ in $S_\rho(A_0) \times \mathbb{T}^\ell$ rather
than their holomorphy in $C(\rho, \xi; A_0)$.


(3) An elementary result of measure theory implies that, given $C > 0$ and
supposing $V$ bounded, the set $V(C)$ of the points $A \in V$ such that


$\left|\frac{\partial h_0(A)}{\partial A} \cdot \nu\right|^{-1} < C|\nu|^\ell, \qquad \forall \nu \in \mathbb{Z}^\ell,\ |\nu| > 0$

has a Lebesgue measure $\mathrm{vol}(V(C)) \to \mathrm{vol}(V)$ as $C \to \infty$ if $M_0(A)^{-1}$ exists, $\forall A \in V$;
see Problems 9 and 10 to §5.10, p.477. It follows that for all $A$'s outside a set
of zero Lebesgue measure, there is a number $C$, depending on $A$, such that the
above inequality holds. The question of determining or estimating a number
$C$ such that $C > \sup_{0 \ne \nu \in \mathbb{Z}^\ell} |\nu \cdot \omega|^{-1} |\nu|^{-\ell}$ is, for a given $\omega$, an interesting
and difficult number-theoretic problem. Some of its aspects are discussed in
detail in the problems of §2.20.


(4) Suppose $f_0 = \lambda \bar f_0$, with $\bar f_0$ $\lambda$-independent, and fix a set $\bar V \subset V$, bounded
and closed. Using the notations of the Observation (2) let

$E_0 = \max_{A \in \bar V} \left|\frac{\partial h_0}{\partial A}\right|_{\rho(A)}, \qquad \eta_0 = \max_{A \in \bar V} \|M_0^{-1}\|_{\rho(A)}$.


Then apply Proposition 22 to the Hamiltonian system described by Eq.
(5.12.3) in $S_{\rho(A_0)}(A_0) \times \mathbb{T}^\ell$ with $A_0 \in \bar V(C)$, see Observation (3) above. By Eq.
(5.12.6), one immediately deduces that for $\lambda$ small, the perturbed Hamiltonian
system admits simultaneously coexisting invariant tori $\mathcal{T}(\omega(A_0))$, $\forall A_0 \in
\bar V(C)$. Such tori will be located geometrically close to the unperturbed tori,
by Eq. (5.12.9).

This means that the “less resonant” the pulsations $\omega$ of the unperturbed
quasi-periodic motions,³¹ the larger the perturbation's intensity $\lambda$ has to become
before it can possibly succeed in destroying these motions and the invariant
tori on which they take place.


(5) Observations (1)-(4) above show that the statement (i) of Proposition 15, 
p.460, §5.9, and the statement that W€) 4 Ø follow from Proposition 22. 
From the proof of Proposition 22, however, all of Proposition 15 follows with 
some effort, in the analytic case. We shall not discuss this problem, (see for 
instance [39], [12], [21]). 


(6) The condition (5.12.6) involves only the derivatives of ho and fo: this is 
natural since only such functions appear in the equations of motion. 

Also, it should be noted that the nature of the condition (5.12.6) is quite
simple: given $h_0, f_0, A_0$, one can form the quantities $E_0, \varepsilon_0, C$ and, with them,
the “dimensionless quantities” $C\varepsilon_0$, $CE_0$, $\eta_0 \rho_0^{-1} E_0$, $\xi_0$ in terms of which all the
other dimensionless quantities can be formed. It can be seen that


$CE_0 > 1, \qquad \eta_0 \rho_0^{-1} E_0 > 1$, (5.12.11)


see Problem 1 at the end of the section. Then Eq. (5.12.6) just says that the
perturbation strength $C\varepsilon_0$ has to be small compared to the other “small”
dimensionless quantities $(CE_0)^{-1}$, $(\eta_0 \rho_0^{-1} E_0)^{-1}$ and $\xi_0$ which are relevant to
the problem.

Note that in the above argument, the parameters “relevant to the problem” 
are just Eo, £0, C, 00, 0, no: this is, in fact, not obvious and, a priori, one 


might expect that other quantities may be relevant, like fy = | FB | oo OF 
Fy = pn o; etc. All that the above argument says is that if the results 


of Proposition 22 hold under conditions that just involve Eo, €0, C, 20, £o, No; 
then it is not surprising that such conditions can take the form of Eq. (5.12.6), 
i.e., the simplest imaginable form. 


(7) The condition $\eta_0 < +\infty$ or something like it must be necessary: in fact,
for isochronous systems the above theorem cannot hold. Just consider $\ell =$

31 i.e., the smaller C is. 
$1$, $h_0(A) = A$, $f_0(A, \varphi) = \varepsilon A$; in this case all the perturbed motions have
pulsations $\omega = 1 + \varepsilon$ and none has pulsation $\omega_0 = 1$. The parameter $\eta_0$ will be called
the “anisochrony parameter” and a system for which $\eta_0 < +\infty$ is said to be
“anisochronous” near $A_0$.

The systems of harmonic oscillators are strictly isochronous, and the theorem
does not directly apply to them.

However, if $h_0(A) = \omega_0 \cdot A$ and if $\omega_0$ verifies Eq. (5.12.4), then the theorem
can still be indirectly applied under some additional assumptions. In fact, let
$f_0 = \lambda \bar f_0$ with $\bar f_0$ $\lambda$-independent. Assume that


= 

0 NGAAA 
where foo denotes the average of fo over T“, i.e., its Fourier coefficient with 
v = 0 {see also Eq. (5.11.16)]. Now apply Proposition 17, p.469, to change 
variables completely canonically and to transform the problem into that of 
the analysis of the systems with Hamiltonian 


\ I < +00 (5.12.12) 


ho(A) + AMY fo(A, p), (5.12.13) 


where ho, fj are holomorphic in ae 00; 560; Ao) and in the variable A for A 
close to zero, see Eq. (5.10.27); choose n as n = al + b + £, a and b being the 
constant in Eq. (5.12.6). Also, from Eq. (5.10.41), we see that 


$$ h'_0(A) = h_0(A) + \lambda \bar f_{00}(A) + \lambda^2 h(A), \tag{5.12.14} $$
where $h$ is analytic in $\lambda$ (actually, it is a polynomial) and in $A$, near $A_0$.


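Therefore, if $\lambda$ is small enough, the quantities $E'_0, \eta'_0, \varepsilon'_0$ such that

$$ E'_0 \ge \left| \frac{\partial h'_0}{\partial A} \right|_{\rho_0/2}, \qquad
   \eta'_0 \ge \left\| \left( \frac{\partial^2 h'_0}{\partial A\, \partial A} \right)^{-1} \right\|_{\rho_0/2}, \qquad
   \varepsilon'_0 \ge \lambda^{n+1} \left( \left| \frac{\partial f'_0}{\partial A} \right|_{\rho_0/2,\, \xi_0/2}
   + \frac{2}{\rho_0} \left| \frac{\partial f'_0}{\partial \varphi} \right|_{\rho_0/2,\, \xi_0/2} \right) \tag{5.12.15} $$

can be chosen so that for a suitable $K > 0$, depending on $E_0, \varepsilon_0, \rho_0, \xi_0$ but
not on $\lambda$, and $\forall \lambda$ small:

$$ E'_0 < 2E_0, \qquad \eta'_0 \le K\lambda^{-1}, \qquad \varepsilon'_0 < K\lambda^{n+1}. \tag{5.12.16} $$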


Consider, next, the points $A'_0 \in S_{\rho_0/4}(A_0)$ with $\omega'(A'_0) = \frac{\partial h'_0}{\partial A}(A'_0)$ such that

$$ |\omega'(A'_0) \cdot \nu|^{-1} < C |\nu|^{\ell}, \qquad \forall \nu \in Z^\ell,\ |\nu| > 0. \tag{5.12.17} $$

Using the results of the Problems 9 and 15, §5.10, p.477 and 478, and the
estimate on $\eta'_0$, it is possible to see that such points actually exist and fill
a considerable part of $S_{\rho_0/4}(A_0)$ (in fact, their ensemble forms a set whose
measure approaches that of $S_{\rho_0/4}(A_0)$ itself as $C \to \infty$, uniformly in $\lambda$).
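
As a rough numerical illustration of a condition of the form of Eq. (5.12.17), the following Python sketch scans the integer vectors with $0 < |\nu| \le N$ and returns the largest value of $1/(|\omega\cdot\nu|\,|\nu|^\ell)$ that it meets. It is only a sketch under stated assumptions: the function name `diophantine_constant` is illustrative, the norm $|\nu| = \sum_i |\nu_i|$ is assumed (the convention of Eq. (5.11.1) is not reproduced here), and a finite cut-off can suggest, but never prove, that a finite constant $C$ exists.

```python
import itertools
import math


def diophantine_constant(omega, n_max):
    """Largest value of 1 / (|omega.nu| * |nu|**l) over 0 < |nu| <= n_max.

    Assumptions (not taken from the text): |nu| = sum_i |nu_i| and
    l = len(omega). Returns math.inf if omega.nu vanishes exactly,
    i.e. if a resonance is found below the cut-off.
    """
    l = len(omega)
    worst = 0.0
    for nu in itertools.product(range(-n_max, n_max + 1), repeat=l):
        size = sum(abs(k) for k in nu)
        if size == 0 or size > n_max:
            continue
        dot = abs(sum(w * k for w, k in zip(omega, nu)))
        if dot == 0.0:
            return math.inf
        worst = max(worst, 1.0 / (dot * size ** l))
    return worst


# A "very irrational" pulsation vector and a resonant one.
golden = (1.0, (1.0 + math.sqrt(5.0)) / 2.0)
resonant = (1.0, 1.0)
c_golden = diophantine_constant(golden, 30)
c_resonant = diophantine_constant(resonant, 30)
```

For the golden-mean vector the scan stays bounded up to the cut-off, while for the resonant vector $(1,1)$ the divisor $\omega\cdot\nu$ vanishes at $\nu = (1,-1)$ and no finite $C$ can work.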
Proposition 20 can be applied to $h'_0 + \lambda^{n+1} f'_0$, regarded as holomorphic on
$C(\tfrac14\rho_0, \tfrac12\xi_0; A'_0)$ with $A'_0$ verifying Eq. (5.12.17), and Eq. (5.12.6) becomes

$$ B\,(K\lambda^{n+1})\,C\,(2E_0C)^a\,(K\lambda^{-1}\,4\rho_0^{-1}\,2E_0)^b\,(\tfrac12\xi_0)^{-c} < 1 $$

which can be fulfilled for $\lambda$ small.
This could be interpreted as saying that the quasi-periodic motions with $A'_0$
such that Eq. (5.12.17) holds are not destroyed by the perturbation, but survive
with a slightly modified pulsation (since $\omega'(A'_0) = \omega_0 + O(\lambda)$), running
on slightly deformed tori.
(8) So Observation (7) shows that the non isochrony condition, $\eta_0 < +\infty$,
can be essentially weakened. One can ask whether this is the case for the “non
resonance” condition $C < +\infty$ as well. The whole discussion of perturbation
theory, §5.10, suggests that this is not the case.
In fact, by considering some extreme cases, it appears that one cannot go
too far toward weakening the conditions. Consider a system of harmonic isochronous
resonating oscillators in $R^3$:

$$ H(p,q) = \tfrac12 (p^2 + q^2), \tag{5.12.18} $$

and use action-angle coordinates $(A, \varphi) \in (R_+ \setminus \{0\})^3 \times T^3$ to describe (most
of) the motions via the Hamiltonian$^{32}$

$$ h_0(A) = A_1 + A_2 + A_3 $$

on $(0, +\infty)^3 \times T^3$. A further completely canonical change of coordinates,
$A \to A'$, $\varphi \to \varphi'$:

$$ \begin{aligned}
A'_1 &= A_1 + A_2 + A_3, & \varphi'_1 &= \varphi_1, \\
A'_2 &= A_2, & \varphi'_2 &= \varphi_2 - \varphi_1, \\
A'_3 &= A_3, & \varphi'_3 &= \varphi_3 - \varphi_1,
\end{aligned} \tag{5.12.19} $$

(see Problem 33, §3.11, p.232) transforms the Hamiltonian (5.12.18) into

$$ h_0(A') = A'_1, \qquad (A', \varphi') \in (0, +\infty)^3 \times T^3. \tag{5.12.20} $$
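
The linear change of coordinates (5.12.19) can be checked mechanically. The sketch below, which is only an illustration and not part of the original argument, writes the map as a $6\times 6$ integer matrix acting on $(A_1,A_2,A_3,\varphi_1,\varphi_2,\varphi_3)$ and verifies the symplectic condition $M^T J M = J$, which for a linear map is equivalent to being completely canonical. The helper names `transpose` and `matmul` and the ordering of the variables are choices made here.

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Rows give (A'_1, A'_2, A'_3, phi'_1, phi'_2, phi'_3) in terms of
# (A_1, A_2, A_3, phi_1, phi_2, phi_3), following Eq. (5.12.19).
M = [
    [1, 1, 1, 0, 0, 0],   # A'_1 = A_1 + A_2 + A_3
    [0, 1, 0, 0, 0, 0],   # A'_2 = A_2
    [0, 0, 1, 0, 0, 0],   # A'_3 = A_3
    [0, 0, 0, 1, 0, 0],   # phi'_1 = phi_1
    [0, 0, 0, -1, 1, 0],  # phi'_2 = phi_2 - phi_1
    [0, 0, 0, -1, 0, 1],  # phi'_3 = phi_3 - phi_1
]

# Standard symplectic matrix J = [[0, I], [-I, 0]] for 3 degrees of freedom.
n = 3
J = [[0] * (2 * n) for _ in range(2 * n)]
for i in range(n):
    J[i][n + i] = 1
    J[n + i][i] = -1

is_symplectic = matmul(matmul(transpose(M), J), M) == J

# The new action A'_1 is the old Hamiltonian A_1 + A_2 + A_3.
sample = [0.5, 1.5, 2.0, 0.1, 0.2, 0.3]
new_point = [sum(m * x for m, x in zip(row, sample)) for row in M]
```

Since all entries are integers the comparison is exact; `new_point[0]` reproduces $A_1+A_2+A_3$, in agreement with Eq. (5.12.20).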


Let $f_0(A'_2, A'_3, \varphi'_2, \varphi'_3)$ be an analytic non integrable Hamiltonian on $V \times T^2 \subset
(0, +\infty)^2 \times T^2$: its existence is not obvious, but we state without proof that
it exists and that it can be chosen so that it produces non quasi periodic
motions.$^{33}$ Then the system

$$ h_0(A') + \varepsilon f_0(A'_2, A'_3, \varphi'_2, \varphi'_3) \tag{5.12.21} $$

cannot be integrable as, manifestly, the coordinates corresponding to the
degrees of freedom with indices 2 and 3 verify the equations with Hamiltonian
$\varepsilon f_0$ which gives rise, for $\varepsilon \ne 0$, to motions coinciding with those of $f_0$, up to
a change of scale in time and which are not quasi periodic, i.e., not integrable
by criterion (i), p.353.

32 See exercises for §4.1.

33 An example could be constructed on the basis of Observation (3), p.336, but the
discussion is quite long.

The example shows why resonances can be important. In a resonant situation, 
it happens that some degrees of freedom of the system “do not move at all” 
as can be seen by suitable changes of coordinates. Hence, upon perturbation, 
their motion will be entirely governed by the perturbation and it will therefore 
become important whether or not the perturbation by itself is integrable. 

If the perturbation by itself describes an integrable system in the phase space 
region around a resonant torus of the unperturbed system, the above argu- 
ment suggests that something could, nevertheless, be done. This is in fact the 
situation found in celestial mechanics in the vicinity of the unperturbed tori 
corresponding to orbits of small eccentricity and small inclination. As shown 
in §5.10, Observation (3), p.473, in this situation one can set up some per- 
turbation scheme to compute the secular perturbations. The scheme can lead 
to a rigorous proof of tori conservation (under suitable assumptions on the 
phase-space region which is considered). This proof is in a celebrated paper 
by Arnold, [3]. 

Of course, in the above discussion, one could have directly started from Eqs.
(5.12.20) and (5.12.21), but we thought that starting from a physical system
would be easier for the reader. On the other hand, the choice of $R^3$ is essential
to the argument: if we had chosen $R^2$, the argument could have failed
since only $A'_2, \varphi'_2$ would have been present, i.e., $f_0$ would have described a
one-degree-of-freedom system (which is “necessarily”$^{34}$ integrable).


(9) The above observation shows that the non resonance condition is essential 
in a case in which the resonance is very manifest, i.e., the unperturbed system 
is isochronous and resonating. However, one could think that the non inte- 
grability phenomenon might be only related to isochronous resonances: if a 
system is anisochronous one might argue that the perturbation will cause the 
motion to wander around in phase space, keeping it away from the resonances 
most of the time. The fallaciousness of this way of reasoning is made clear by 
an example that goes back to Poincaré. Consider the system on $R^2 \times T^2$:

$$ H(A_1, A_2, \varphi_1, \varphi_2) = \tfrac12 (A_1^2 + A_2^2) + \varepsilon f(\varphi_1, \varphi_2), \qquad \text{with} \tag{5.12.22} $$

$$ g(\varphi_2 - \varphi_1) \stackrel{def}{=} \int_0^{2\pi} f(\varphi_1 + \psi, \varphi_2 + \psi)\, d\psi = \text{not constant}. \tag{5.12.23} $$

To fix the ideas we shall take $f(\varphi_1, \varphi_2) = 1 - \cos(\varphi_2 - \varphi_1)$: in this case Eq.
(5.12.22) has a simple physical meaning, as it describes two points ideally
bound to a unit circle attracting each other via a harmonic force. The reader
should, as an exercise, understand the physical meaning of the argument below
and why it can be immediately extended to the general case if (5.12.23) holds.
For $\varepsilon = 0$ all the motions on the torus $\{A_0\} \times T^2$, $A_0 = (1,1)$, are periodic
with pulsations $\omega_0 = A_0 = (1,1)$, so the torus is resonant.

34 See the statement (19), p.363, and §2.7 for general conditions of integrability.

Suppose, per absurdum, that the torus is not destroyed for small $\varepsilon$, in the
sense that there exists an invariant torus (i.e., invariant with respect to the
perturbed motion) with parametric equations:

$$ A = A_0 + \alpha(\varphi'), \qquad \varphi = \varphi' + \beta(\varphi'), \tag{5.12.24} $$

where $\alpha, \beta$ are $R^2$-valued functions in $C^\infty(T^2)$, and that the torus given by
Eq. (5.12.24) is close to the unperturbed torus for $\varepsilon$ small:

$$ \gamma(\varepsilon) \stackrel{def}{=} \max |\alpha(\varphi')| + \max |\beta(\varphi')| \xrightarrow[\varepsilon \to 0]{} 0. \tag{5.12.25} $$

Suppose also that the motion on the torus in Eq. (5.12.24) is described by
$\varphi' \to \varphi' + \omega t$, $\omega = (1,1) = A_0$, i.e., assume that the perturbed torus is run
periodically with the same spectrum as that corresponding to the unperturbed
torus $\{A_0\} \times T^2$. Write the Hamiltonian equations for $h_0 + \varepsilon f$ and subtract
the two equations for $A_1$ and $A_2$:

$$ \dot A_2 - \dot A_1 = -2\varepsilon \sin(\varphi_2 - \varphi_1). \tag{5.12.26} $$

Then integrate both sides between $t = 0$ and $t = 2\pi$, assuming that they are
computed along a motion developing on the torus of Eq. (5.12.24) with
initial datum corresponding to $\varphi' \in T^2$. Since the motion is periodic, by
assumption, with period $2\pi$ ($\omega_0 = (1,1)$), it is

$$ \begin{aligned}
0 &= -2\varepsilon \int_0^{2\pi} \sin(\varphi_2 - \varphi_1)\, dt \\
  &= -2\varepsilon \int_0^{2\pi} \sin\big[\varphi'_2 - \varphi'_1 + \beta_2(\varphi'_1 + t, \varphi'_2 + t) - \beta_1(\varphi'_1 + t, \varphi'_2 + t)\big]\, dt \\
  &= -4\pi\varepsilon \big( \sin(\varphi'_2 - \varphi'_1) + 2\bar\gamma(\varepsilon) \big) \simeq -4\pi\varepsilon \sin(\varphi'_2 - \varphi'_1),
\end{aligned} \tag{5.12.27} $$

where $\bar\gamma(\varepsilon) \in [-\gamma(\varepsilon), \gamma(\varepsilon)]$ is suitably chosen. This is absurd if $\varphi'_2 - \varphi'_1 \ne 0, \pi$
and shows that the torus of Eq. (5.12.24) cannot exist as an invariant torus
run periodically with pulsation $\omega_0 = (1,1)$. The resonating torus corresponding
to $A_0 = (1,1)$ is “destroyed” upon perturbation, no matter how small.
The argument shows that the torus is “destroyed”, but does not show that
all the periodic motions with period $2\pi$ are destroyed. For instance if $\varphi_1 = \varphi_2$
or $\varphi_1 = \varphi_2 + \pi$ we form, together with $A_1 = A_2 = 1$, two sets of initial data
evolving periodically with period $2\pi$ and, topologically, such sets are two circles
(i.e., like $T^1$ instead of $T^2$).
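
The mechanism behind this argument can be observed numerically. For the choice $f = 1-\cos(\varphi_2-\varphi_1)$ the difference variables $\psi = \varphi_2-\varphi_1$, $B = A_2 - A_1$ obey $\dot\psi = B$, $\dot B = -2\varepsilon\sin\psi$. The Python sketch below (an illustration only; the function name `action_drift`, the step count and the use of a classical Runge-Kutta scheme are choices made here, not taken from the text) integrates these equations over one unperturbed period starting from $A = (1,1)$, i.e. $B = 0$, and returns $B(2\pi)$.

```python
import math


def action_drift(eps, psi0, steps=2000):
    """Return B(2*pi) for d(psi)/dt = B, dB/dt = -2*eps*sin(psi), B(0) = 0.

    psi = phi_2 - phi_1 and B = A_2 - A_1 for the Hamiltonian
    (A_1**2 + A_2**2)/2 + eps*(1 - cos(phi_2 - phi_1)).
    Integration uses the classical fourth-order Runge-Kutta method.
    """
    h = 2.0 * math.pi / steps
    psi, b = psi0, 0.0

    def rhs(p, q):
        return q, -2.0 * eps * math.sin(p)

    for _ in range(steps):
        k1 = rhs(psi, b)
        k2 = rhs(psi + 0.5 * h * k1[0], b + 0.5 * h * k1[1])
        k3 = rhs(psi + 0.5 * h * k2[0], b + 0.5 * h * k2[1])
        k4 = rhs(psi + h * k3[0], b + h * k3[1])
        psi += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        b += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return b


eps = 1.0e-3
drift_generic = action_drift(eps, math.pi / 2.0)  # psi0 different from 0, pi
drift_aligned = action_drift(eps, 0.0)            # psi0 = 0
```

For $\psi_0 = \pi/2$ the drift is close to $-4\pi\varepsilon\sin\psi_0$, so the motion cannot close after time $2\pi$, whereas for $\psi_0 = 0$ there is no drift, in agreement with the two surviving circles of periodic motions.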
This example is interesting because it considers a case in which all the
assumptions of Proposition 22 hold except the non resonance condition (5.12.4),
thereby showing its necessity. However, it does not provide an example as
“shocking” as the one of observation (8), since the perturbed system still
exhibits only quasi-periodic motions or motions with rather trivial asymptotic
behavior.$^{35}$ Much more interesting in this respect would be the case when $f$
in Eq. (5.12.23) is replaced by a function really depending on both $\varphi_1$ and
$\varphi_2$, not only on $\varphi_2 - \varphi_1$. In such a case, one expects to find some motions
with very complex asymptotic behavior near a resonating unperturbed torus.


(10) Observations (7)-(9) above clarify the necessity of the assumptions in
Proposition 22. They can be summarized as follows: non resonating
quasi-periodic motions on $\ell$-dimensional tori are preserved, in anisochronous
systems, in the presence of small perturbations; they are also preserved in
isochronous non resonating systems for all the non isochronous small
perturbations (modulo a small change in the frequencies). Resonating motions
on $\ell$-dimensional tori are generally destroyed by small perturbations in both
the isochronous and the non isochronous cases.


(11) It is important that the reader who is about to read the following proof 
realizes that all the very numerous inequalities that will be met can easily 
be guessed on “dimensional grounds”, i.e., using what we called in §5.11, 
Observation (2), p.483, “dimensional estimates”. In this way, one can easily 
check the calculations (which we give in great detail only for completeness 
since this book is supposed to be elementary). 

The possibility of simple dimensional estimates is what makes the proof in 
the analytic case easy to visualize. 

In the upcoming proof no attention is paid to optimal estimates, nor to the 
evaluation of the various constants. However, in principle, the proof below 
does not contain any crude approximation, and if the constants are evaluated 
with care it should give results which are optimal in the given generality of 
the assumptions. 

This, of course, does not mean that in particular cases the estimates could 
not be greatly improved. 

Finally let us point out to the reader familiar with present trends in statistical 
mechanics and field theory that the proof below yields a nice example of a 
vast class of theorems which can be proved by what has become known in 
physics as the “renormalization group method”. 


PROOF. We think of the unperturbed Hamiltonian $h_0$ and the perturbation
$f_0$ as a pair of holomorphic functions on $C(\rho_0, \xi_0; A_0) \subset C^{2\ell}$, real on $S_{\rho_0}(A_0) \times T^\ell$,
see Definition 9, p.481, §5.11. To this pair we associate the “characteristic
numbers” $E_0, \eta_0, \rho_0, \xi_0, \varepsilon_0$ verifying Eq. (5.12.5).

35 As can be seen by the completely canonical change of variables $A = \tfrac12(A_1 + A_2)$,
$B = \tfrac12(A_1 - A_2)$, $\varphi = \varphi_1 + \varphi_2$, $\psi = \varphi_1 - \varphi_2$ (exercise).

We have already noted that [see Eq. (5.12.11)]

$$ \eta_0\rho_0^{-1}E_0 \ge 1, \qquad CE_0 \ge 1. \tag{5.12.28} $$


In the course of the proof, we shall have to “give up” some analyticity in the
$A$ and $\varphi$ variables in order to make dimensional estimates. The amount of
analyticity that is given up is, to a great extent, arbitrary: we introduce some
“analyticity loss” parameters $\delta_0 > \delta_1 > \ldots$ which will be used to describe
precisely the analyticity loss. To be definite, let
1 & 
= ——_> _ 5.12.29 
ok = y (1+k)? ( ) 


so that 5 $z o ôk < ĉo < 1. For simplicity, assume that CeoE < 1. 

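The identification of $\varphi = (\varphi_1, \ldots, \varphi_\ell) \in T^\ell$ with $z = (e^{i\varphi_1}, \ldots, e^{i\varphi_\ell}) \in C^\ell$
will be often used, while also freely using an “angular notation” for $z$ even
if $z$ is not on the product of the $\ell$ unit circles. In this case, $\partial/\partial\varphi_k$ means
$i z_k\, \partial/\partial z_k$, see Definition 9 (ii), p.481. Also, it will be convenient to write
$e^{i\Delta} = (e^{i\Delta_1}, \ldots, e^{i\Delta_\ell})$ and $z e^{i\Delta} = (z_1 e^{i\Delta_1}, \ldots, z_\ell e^{i\Delta_\ell})$ for $\Delta \in C^\ell$. Such
conventions greatly simplify the notations.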

The proof proceeds by applying perturbation theory along the lines of
§5.10. Since the first problem is that $f_0$ does not fulfill the assumptions of
Proposition 17, we shall divide $f_0$ into two parts: one very small, $O(\varepsilon_0^2)$, and
the other fulfilling the assumptions of Proposition 17, i.e., with only finitely
many Fourier components (“Arnold regularization”).

Then we shall apply Proposition 17 to find a canonical transformation
changing the Hamiltonian into a “renormalized” one with an integrable part
$h_1(A)$ plus a perturbation $f_1(A, \varphi)$ with $f_1$ of $O(\varepsilon_0^2)$. Afterwards, we proceed
to find a point $A_1$ such that $\frac{\partial h_1}{\partial A}(A_1) = \omega_0$, and we shall again be in a position
to begin the procedure all over again, provided we control the new
characteristic parameters $E_1, \varepsilon_1, \rho_1, \xi_1, \eta_1$. Basically, the whole argument is reduced
to searching for an expression of $E_1, \varepsilon_1, \rho_1, \xi_1, \eta_1$ in terms of $E_0, \varepsilon_0, \rho_0, \xi_0, \eta_0$
(“Kolmogorov’s iteration”).
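
The reason why such an iteration can converge is that each step replaces a perturbation of size $\varepsilon$ by one of size proportional to $\varepsilon^2$, so that even constants growing with the step index are overwhelmed. The Python sketch below is a toy model only: the recursion $x_{n+1} = K (n+1)^p x_n^2$, the name `iterate_quadratic` and the values of `big_k` and `p` are assumptions made here for illustration and are not the recursion of the proof.

```python
def iterate_quadratic(x0, big_k, p, n_steps):
    """Iterate the toy recursion x_{n+1} = big_k * (n + 1)**p * x_n**2.

    x_n mimics a dimensionless perturbation size after n steps; big_k and p
    are hypothetical constants standing for the step-dependent factors.
    """
    xs = [x0]
    for n in range(n_steps):
        xs.append(big_k * (n + 1) ** p * xs[-1] ** 2)
    return xs


# Small initial size: super-exponential decay despite the growing factors.
small_start = iterate_quadratic(x0=1.0e-4, big_k=100.0, p=4, n_steps=6)

# Initial size too large for the same constants: the sequence blows up.
large_start = iterate_quadratic(x0=1.0e-1, big_k=100.0, p=4, n_steps=6)
```

With these illustrative constants the first sequence decreases at every step, while the second grows without bound; the smallness condition on the initial perturbation plays the role that a condition like Eq. (5.12.6) plays in the proof.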

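To reduce $f_0$ to a trigonometric polynomial plus a small remainder,
introduce the “ultraviolet cut off”:

$$ N_0 = \frac{2}{\delta_0} \log \frac{1}{C\varepsilon_0\delta_0^{\ell}} > 1 \tag{5.12.30} $$

and define the “regularized perturbation”

$$ f^{[\le N_0]}(A, z) = \sum_{\nu \in Z^\ell,\ |\nu| \le N_0} f_{0\nu}(A)\, z^{\nu} \tag{5.12.31} $$

using for $f_0$ the notation of Eqs. (5.11.16), p.482. Let $f^{[>N_0]} \stackrel{def}{=} f_0 - f^{[\le N_0]}$.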

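The choice of $N_0$ has been made so that $f^{[>N_0]}$ is indeed of $O(\varepsilon_0^2)$. This
can be seen by applying the estimates of Eqs. (5.11.15) to the functions $\frac{\partial f_0}{\partial A_i}$
and $\rho_0^{-1}\frac{\partial f_0}{\partial \varphi_i}$, holomorphic in $C(\rho_0, \xi_0; A_0)$, regarded as functions on $C(\xi_0)$
parameterized by $A \in S_{\rho_0}(A_0)$: $\forall \nu \in Z^\ell$, $\forall i = 1, \ldots, \ell$,

$$ \left| \frac{\partial f_{0\nu}}{\partial A_i}(A) \right| \le \varepsilon_0\, e^{-\xi_0|\nu|}, \qquad
   |\nu_i|\, |f_{0\nu}(A)| \le \varepsilon_0\rho_0\, e^{-\xi_0|\nu|}, \tag{5.12.32} $$

by the third of Eqs. (5.11.15). Also note that by item (v) Proposition 18, p.481,
the functions $\frac{\partial f_{0\nu}}{\partial A}(A)$ and $\nu_i f_{0\nu}(A)$ are, $\forall \nu \in Z^\ell$, $\forall i = 1, \ldots, \ell$, holomorphic
on $S_{\rho_0}(A_0)$. Just apply (v) to the functions $g = \frac{\partial f_0}{\partial A}$ and $g = \frac{\partial f_0}{\partial \varphi}$.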


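Equation (5.12.32) allows us to bound $f^{[\le N_0]}$ and $f^{[>N_0]}$ as follows. There
exist $B_1, B_2$, $1 < B_1 < B_2$, such that

$$ \begin{aligned}
\left| \frac{\partial f^{[\le N_0]}}{\partial A} \right|_{\rho_0,\, \xi_0 - \delta_0} + \frac{1}{\rho_0} \left| \frac{\partial f^{[\le N_0]}}{\partial \varphi} \right|_{\rho_0,\, \xi_0 - \delta_0} &\le B_1 \varepsilon_0 \delta_0^{-\ell}, \\
\left| \frac{\partial f^{[>N_0]}}{\partial A} \right|_{\rho_0,\, \xi_0 - \delta_0} + \frac{1}{\rho_0} \left| \frac{\partial f^{[>N_0]}}{\partial \varphi} \right|_{\rho_0,\, \xi_0 - \delta_0} &\le B_2 \varepsilon_0^2 C.
\end{aligned} \tag{5.12.33} $$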


These estimates follow by substituting the bounds given by Eq. (5.12.32) into
Eq. (5.12.31) or into the analogous expression for $f^{[>N_0]}$ after the appropriate
differentiations. For instance, consider the second of Eqs. (5.12.33). One has,
$\forall (A, z) \in C(\rho_0, \xi_0 - \delta_0; A_0)$,


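$$ \begin{aligned}
\left| \frac{\partial f^{[>N_0]}}{\partial A}(A, z) \right|
 &= \Big| \sum_{\nu \in Z^\ell,\ |\nu| > N_0} \frac{\partial f_{0\nu}}{\partial A}(A)\, e^{i\nu\cdot\varphi} \Big|
 \le \sum_{\nu \in Z^\ell,\ |\nu| > N_0} \Big| \frac{\partial f_{0\nu}}{\partial A}(A) \Big|\, e^{(\xi_0 - \delta_0)|\nu|} \\
 &\le \sum_{\nu \in Z^\ell,\ |\nu| > N_0} \varepsilon_0\, e^{-\xi_0|\nu|} e^{(\xi_0 - \delta_0)|\nu|}
 \le \varepsilon_0\, e^{-\frac12 \delta_0 N_0} \sum_{\nu \in Z^\ell,\ |\nu| > N_0} e^{-\frac12 \delta_0 |\nu|} \\
 &\le \varepsilon_0\, e^{-\frac12 \delta_0 N_0} \left( \frac{1 + e^{-\delta_0/2}}{1 - e^{-\delta_0/2}} \right)^{\ell} \le B' \varepsilon_0^2 C,
\end{aligned} \tag{5.12.34} $$

where in the first equality, we use the symbolic but suggestive “angular notation”
for $z$, and $B' > 0$ is a suitable constant. Similarly,

$$ \left| \frac{\partial f^{[>N_0]}}{\partial \varphi}(A, z) \right|
 = \Big| \sum_{\nu \in Z^\ell,\ |\nu| > N_0} i\nu\, f_{0\nu}(A)\, z^{\nu} \Big|
 \le \varepsilon_0 \rho_0 \sum_{\nu \in Z^\ell,\ |\nu| > N_0} e^{-\delta_0|\nu|} \le B' C \varepsilon_0^2 \rho_0. \tag{5.12.35} $$

Hence, the second of Eqs. (5.12.33) follows from Eqs. (5.12.34) and (5.12.35).
The first of Eqs. (5.12.33) follows from the same type of arguments.$^{36}$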


36 sg ae £ be = 350 4Jve 
One could take, say, Bı = Bz = 2(4,/e)*, because + ae rs <3 if ôo < 1. 
Following the ideas of perturbation theory, a canonical change of variables
will be constructed using, as in the proof to Proposition 16, §5.10, p.466, a
generating function of the form $(A', \varphi) \to A' \cdot \varphi + \Phi_0(A', \varphi)$, where $\Phi_0$ is
defined on a suitable set $S_{\tilde\rho_0}(A_0) \times T^\ell$ as

$$ \Phi_0(A', \varphi) = \sum_{\nu \in Z^\ell,\ 0 < |\nu| \le N_0} \frac{f_{0\nu}(A')\, z^{\nu}}{-i\, \omega(A') \cdot \nu} \tag{5.12.36} $$

which defines a holomorphic function of $(A', z) \in C(\tilde\rho_0, \xi_0 - \delta_0; A_0)$ if $\tilde\rho_0$ is
chosen so small that, by consequence of Eq. (5.12.4), $|\omega(A) \cdot \nu| > 0$, $\forall \nu \in Z^\ell$,
$0 < |\nu| \le N_0$, $\forall A \in S_{\tilde\rho_0}(A_0)$. Actually, a simple choice for $\tilde\rho_0$, good enough for
our purposes, is given in Eq. (5.12.39). In fact, $\forall A' \in S_{\tilde\rho_0}(A_0)$ and if $\tilde\rho_0 < \tfrac12\rho_0$, it is


\w(A’) -v| = |(wo + (w(A‘) — w(Ao))) v 


Sira | — KAA = eo) ot 


lwo- v] (5.12.37) 


< C [vf |1 — 20pm | 
20 


because we can bound |w(A’) — w(Aog)| as 


jw(A’) — w(Ao)| = | i dt Lolan te Ao))| 


=| i a (s Orolo + ICA Ao) (Aj — Aoj)) dt (5.12.38) 
7 j=l j 


zy fn 


= 0o Š 22E 
20 — 00 20 


by a dimensional estimate like the first of Eqs. (5.11.18); hence, if 


~ def 00 
ye AEN 5.12.39 
00 UCENE ( ) 


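then, since $CE_0 \ge 1$, $N_0 > 1$, $\tilde\rho_0 < \tfrac12\rho_0$, Eq. (5.12.37) implies

$$ |\omega(A') \cdot \nu|^{-1} < 2C|\nu|^{\ell}, \qquad \forall\ 0 < |\nu| \le N_0 \tag{5.12.40} $$

for $A' \in S_{\tilde\rho_0}(A_0)$. Hence, Eq. (5.12.36) implies that $\Phi_0$ is holomorphic in
$C(\tilde\rho_0, \xi_0 - \delta_0; A_0)$ and that, using the second of Eqs. (5.12.32),$^{37}$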


37 Recall that v|,v € Z! and |wl,w € C%, have a different meaning by our conventions, 
Eqs. (5.11.1) and (5.11.2). This explains the factor £. 
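$$ \begin{aligned}
|\Phi_0(A', z)| &\le \sum_{0 < |\nu| \le N_0} \frac{|f_{0\nu}(A')|}{|\omega(A') \cdot \nu|}\, e^{(\xi_0 - \delta_0)|\nu|} \\
 &\le \sum_{0 < |\nu| \le N_0} 2C|\nu|^{\ell}\, |f_{0\nu}(A')|\, e^{(\xi_0 - \delta_0)|\nu|} \\
 &\le \sum_{0 < |\nu|} 2C|\nu|^{\ell - 1}\, \ell\, \varepsilon_0 \rho_0\, e^{-\delta_0|\nu|} < B_3 \varepsilon_0 C \rho_0 \delta_0^{-2\ell}
\end{aligned} \tag{5.12.41} $$

for all $(A', z) \in C(\tilde\rho_0, \xi_0 - \delta_0; A_0)$, with $B_3 > 2$.$^{38}$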
Hence, by the dimensional estimates of Eq. (5.11.18), 


op = Re 
— ee <2B3e0C 0059 aes Oo t 
OAs (5.12.42) 
OPo a 
—|~ <B3eoC 0069 21551. 
Op aut R 9 
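Therefore, it makes sense to consider the map

$$ A = A' + \frac{\partial \Phi_0}{\partial \varphi}(A', z), \qquad
   z'_j = z_j \exp\Big( i\, \frac{\partial \Phi_0}{\partial A'_j}(A', z) \Big), \quad j = 1, \ldots, \ell, \tag{5.12.43} $$

defined for $(A', z) \in C(\tilde\rho_0, \xi_0 - \delta_0; A_0)$ with values in $C^{2\ell}$. Here we regard the
second of Eqs. (5.12.43) as the complex version of

$$ \varphi' = \varphi + \frac{\partial \Phi_0}{\partial A'}(A', z). \tag{5.12.43'} $$

Now the problem arises of inverting the first of Eqs. (5.12.43) or the second
of Eqs. (5.12.43) in the respective forms

$$ A' = A + \Xi'(A, z), \qquad
   z_j = z'_j \exp\big( i\, \Delta_j(A', z') \big), \quad j = 1, \ldots, \ell, \tag{5.12.44} $$

where the second should be regarded as the complex extension of

$$ \varphi = \varphi' + \Delta(A', \varphi'), \qquad \varphi' \in T^\ell. \tag{5.12.45} $$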


For this purpose, we use, respectively, Proposition 21, p.485, and Propo- 
sition 20, p.484, §5.11 (choosing, say, 7 = log2). They guarantee that the 
above inversions can indeed be made in the desired form, via Eqs. (5.12.39) 
and (5.12.42), if 


1 
PECE” = BaeoC ECNE! S7” <1, (5.12.46) 
20 


z e 
38 Using Z, |v|te-9l4| < maxyso(y%e79¥/2) ye 8IMI/2 < maxysq(yte-¥)(2)2 AE 
one can take, say, B3 = €!2°(4,/e)*. 


where $B_4$ is a suitable constant determined by imposing Eqs. (5.11.23) and
(5.11.27).$^{39}$ In this case, $\Xi'$ is holomorphic on $C(\tfrac12\tilde\rho_0, \xi_0 - 2\delta_0; A_0)$ as well as
$\Delta$ and they verify the bounds


5 2o 
eee 8 (5.12.47) 
IANS, /2,¢0 250 <2B3E0C 0059 t103" < ðo, 


| | 
m 
— 


—2é 
50 <B3£0C 0059 5 e76o < 


where the first right-hand-side inequalities follow from Eq. (5.11.29) or Eq.
(5.11.25), while the second right-hand-side inequalities follow, if $B_4$ is chosen
as in the footnote,$^{39}$ from Eq. (5.12.46).

Eqs. (5.12.47) permit us to define on $C(\tfrac12\tilde\rho_0, \xi_0 - 2\delta_0; A_0)$, say, the functions

$$ \Xi(A', z') = \frac{\partial \Phi_0}{\partial \varphi}\big( A', z' e^{i\Delta(A', z')} \big), \qquad
   \Delta'(A, z) = \frac{\partial \Phi_0}{\partial A'}\big( A + \Xi'(A, z), z \big) \tag{5.12.48} $$

and, by Eqs. (5.12.42) and (5.12.39) they verify
z~ ~ 26 226 2o 
l= 15, /2,¢0-380 <Bse0C 00% pre a 8° (5.12.49) 


IA 14,625) <4¢B3e0C E00 Ngt 8g t < bo. 
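Therefore we define the maps $C_0$, on $(A', z') \in C(\tfrac12\tilde\rho_0, \xi_0 - 2\delta_0; A_0)$, by

$$ A = A' + \Xi(A', z'), \qquad z = z' e^{i\Delta(A', z')}, \tag{5.12.50} $$

and $\tilde C_0$, on $(A, z) \in C(\tfrac12\tilde\rho_0, \xi_0 - 2\delta_0; A_0)$, by

$$ A' = A + \Xi'(A, z), \qquad z' = z\, e^{i\Delta'(A, z)}, \tag{5.12.51} $$

which have the properties [by Eqs. (5.12.47) and (5.12.49)]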


39 One can take B4 = 4ye? B3é. Note that, not surprisingly, both inversions require the 
same condition up to a constant factor, adjusted in Eq. (5.12.46) to be the same. This is 
basically so because the implicit function theorems impose conditions on 0769 /0A/d~ 
for the first inversion or on 0769/0 p~0A! for the second. 

Actually, it would be easy to check that Eqs. (5.12.42) and (5.12.46) automatically imply 
that the matrix Jij = 6;; + Abo /OA', y; is invertible in C(¥00, ĉo — 469; Ao) if B3, Ba 
are chosen as in footnote 37 to p.498 and as above. Hence, the general theory of the 
canonical transformations, §3.11, Problems 9-11, p.228, shows that under the condition 
of Eq. (5.12.46), the map of Eq. (5.12.43) locally generates a completely canonical trans- 
formation defined on Siz (Ao) x T“, changing (A’, y’) into (A, 4). This map is actually 


a globally canonical map, as we shall see. 
$$ C_0, \tilde C_0 : C(\tfrac14\tilde\rho_0, \xi_0 - 3\delta_0; A_0) \to C(\tfrac12\tilde\rho_0, \xi_0 - 2\delta_0; A_0). \tag{5.12.52} $$

Hence, it makes sense to consider $C_0\tilde C_0$ and $\tilde C_0 C_0$ on $C(\tfrac14\tilde\rho_0, \xi_0 - 3\delta_0; A_0)$.
By construction, $C_0$ and $\tilde C_0$ are inverses of each other:

$$ C_0\tilde C_0 = \tilde C_0 C_0 = \{\text{identity map}\} \tag{5.12.53} $$

on $C(\tfrac14\tilde\rho_0, \xi_0 - 3\delta_0; A_0)$.
It follows, by the general theory of canonical maps, that $C_0$ and $\tilde C_0$ are
completely canonical, inverse to each other, maps of $S_{\tilde\rho_0/4}(A_0) \times T^\ell$ onto its image.
If a motion takes place in $C_0(S_{\tilde\rho_0/4}(A_0) \times T^\ell)$ it can be described in the $(A', \varphi')$
variables as a motion generated by the Hamiltonian:

$$ H_1(A', \varphi') = h_0(A' + \Xi(A', \varphi')) + f_0(A' + \Xi(A', \varphi'), \varphi' + \Delta(A', \varphi')) \tag{5.12.54} $$


which, following the perturbation theory and $\forall (A', z') \in C(\tfrac14\tilde\rho_0, \xi_0 - 3\delta_0; A_0)$,
we write as

$$ \begin{aligned}
H_1(A', \varphi') = {} & \{ h_0(A') + f_{00}(A') \} \\
 & + \{ h_0(A' + \Xi(A', \varphi')) - h_0(A') \\
 & \quad + f_0(A' + \Xi(A', \varphi'), \varphi' + \Delta(A', \varphi')) - f_{00}(A') \} \\
 \stackrel{def}{=} {} & \{ h_1(A') \} + \{ f_1(A', \varphi') \}
\end{aligned} \tag{5.12.55} $$

where $h_1$ and $f_1$ are implicitly defined, respectively, as the first and second
curly-bracket terms in the intermediate equality in Eq. (5.12.55).

We shall henceforth regard $C(\tfrac14\tilde\rho_0, \xi_0 - 3\delta_0; A_0)$ as the domain of definition
and holomorphy of $h_1$ and $f_1$: however we shall further reduce it, later, for the
purpose of using dimensional estimates or for other needs. This basic choice
of domain is convenient since we control well $C_0$ on this set, see Eq. (5.12.52).

Our next task, according to the program of the proof, is to find a point
$A_1 \in S_{\tilde\rho_0/8}(A_0)$ such that

$$ \frac{\partial h_1}{\partial A}(A_1) = \omega_0. \tag{5.12.56} $$

Recalling that $\omega_0 = \frac{\partial h_0}{\partial A}(A_0)$, this equation can be elaborated as
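$$ \begin{aligned}
& \omega(A') - \omega(A_0) + \frac{\partial f_{00}}{\partial A}(A') = 0 \\
\Longrightarrow\ & M(A_0)(A' - A_0) + \Big[ \omega(A') - \omega(A_0) - M(A_0)(A' - A_0) + \frac{\partial f_{00}}{\partial A}(A') \Big] = 0 \\
\Longrightarrow\ & (A' - A_0) + M(A_0)^{-1} \Big[ \omega(A') - \omega(A_0) - M(A_0)(A' - A_0) + \frac{\partial f_{00}}{\partial A}(A') \Big] = 0 \\
\Longrightarrow\ & (A' - A_0) + n(A') = 0
\end{aligned} \tag{5.12.57} $$

where $n$ is defined on $S_{\tilde\rho_0}(A_0)$ by the second term in the third relation
(i.e., $M(A_0)^{-1}$ applied to the term within square brackets).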

Apply Proposition 19, p.484, to the last equation and deduce that if 
y|nlo < o for some 9 < +00, then the equation admits a unique solution 
A, E€ S,(Ao). Hence, we must estimate |n|, for 0 < 400 < $00. For this 
purpose note that 


E 
—— < sm), 
(00 — 0) 20 

having estimated the second derivative of w by a dimensional estimate; see 


Eqs. (5.11.9) and (5.11.18). Hence, if 9 < 40o and if the first of Eqs. (5.12.32) 
is used with v = 0: 


< l2 


In|o < no (80 Bo(--)” + £0) (5.12.58) 
0 


so that if we choose (recalling that C'Ep > l, Ceo < 1) 


it is |n| < 2£0ņ0. Applying Proposition 19, p.484, and if 2ynoeo < +00, then 
Eq. (5.12.56) admits a solution Ay € S15, (Ao). The condition 2yno£o < $00 
8 


becomes, via the expression for go in Eq. (5.12.39), 16ye0C (nap Eo) Not! < 
1; and it can be implied together with Eq. (5.12.46) by requiring 


BseoC EoC (mooz Eo) Nét!” < 1, (5.12.59) 
having used Eq. (5.12.28) and having chosen $B_5$ suitably, $B_5 > 4$. Hence, if
Eq. (5.12.59) holds, the Hamiltonian h, and its perturbation f, in Eq. (5.12.55) 
can be considered as functions defined and holomorphic in C (do, &)—3609; A1) 
with A; so chosen that Eq. (5.12.56) holds. 

The argument can now be iterated. In fact it is possible to associate with 
the Hamiltonians hy, fı in C( 400, £o — 469; A1)*! the characteristic parame- 
ters 01 = +700, £1 = éo — 469 and Fy, 7, €1, with E1, N1, €1 estimates of 


Oh, 07h, —1 ofi ofi 

mle aa A e rla 
To find E1, M,€1, we apply, as usual, some dimensional estimates. The Fy 
estimate is based on the first of Eqs. (5.12.32): 


Ohy Oho Ofoo 
IJa OA! OA’ 

The 7 estimate is based on the dimensional estimate, Eq. (5.11.18), for 

o;(A) = 0? /AA\OA’, as 04;(A’)| < a = 20 WA! € Sa (A1); in fact, 


T a 
720 


kl 


1 
+—|| 
01 


(A) =| 


(A’) + (A’)| < Eo + £0. (5.12.60) 


M,(A’)~! = (Mo(A‘) + 0( A’)? = (Mo(A‘)(1 + Mo(A’)“t0(A’))) 


=(1+ Mo(A')~!0(A’)) "Mo (A‘)7! 


(5.12.61) 
=Mo(A’)~*) + [(1 + Mo(A’)~10(A’)) 


t= 1) My(A!)7) 


and (since given two Z x L matrices R and S it is ||RS|| < ||R|| ||S|| and 
IIG + RB)? — 1] < 2||RII, if RI] < 5) we see that ||Mo(A’)~'o(A’)|| < 


2 
2eom0o < 4 [by Eq. (5.12.59)] and 


M (A^|| < m0 + 4e0n§ 00°: (5.12.62) 


The estimate of £1, is slightly more complicated because it involves derivatives 
of Æ and A: which, however, can be estimated dimensionally. We first elab- 
orate the formal expression of fı by adding and subtracting suitable terms: 


40 e.g. one could take Bs = 2B416y = 23761(8,/e)’ < 2324112% if y = 28, 

al We further restrict the domain in which we consider hı, fı to be able to perform dimen- 
sional estimates later. 

42 See Appendix E, Eqs. (E.2) and (E.10), p.523. 


f(A’, z) =ho(Al + B(A’,2’)) — ho(A’) 
+ fy (A! + B(A’, z), z'ea) foo( A!) 
+ fy (Al + E(A',z') 
={ho(A! + E(A’,2’)) — ho(A’) — w(A')E(A’,z’)} (5.12.63) 
+ {foo (A! + B(A',2!),2/e'4"#)) — foo(A’) 
+ w(A')E(A’,2’)} 
pp E(A', z’), z êA) 


where the addition and subtraction of w : Æ is suggested by the formal per- 
turbation theory and by the fact that, if (A, z) = Co( A’, z’), it is B(A’,2’) = 
Po (A’,z) so that the various terms in curly brackets formally have size O(eĉ). 
In fact, the first term is manifestly of O(||?) and £ is formally of O (£o); the 
third term is by construction of formal order O(eĝ), while the second term 
can be rewritten as 


fim (A! + B(A’,z’), z/ ei A(A’2')) E JEN (A', z'etAlAA z’) (5.12.64) 


because, by the definition of Do, 


w(A'): F(A’, z’) = w AN EEA, pA’, z) = A, z- foo(A’)) 


and, therefore, Eq. (5.12.64) is formally of O(eo)O(Æ), i.e., O(eł) (here we 
use that z’ e'4(42") = z, also). 

Writing the three terms in curly brackets in Eq. (5.12.63) as $f_1^{I}, f_1^{II}, f_1^{III}$
respectively, we now show rigorously that they have the right order of
magnitude. Dropping the $(A', z')$ in the arguments of $\Xi, \Delta$ for simplicity, and using Eq.
(5.12.64) and the Taylor-Lagrange formulae, we find


1 ty d2 
=f ay | dtə gA tt =) 


L ah Pe: 
zi: g JAA. (A’ +t2)5,5)), 


12. 
if I dt 4 EO at 4 te ee) (5 65) 
1 £ [<No] 
o fo 1 iA) z 
=| (5 JA, (A'+tE,z'e =), 
j=l 


soe =flNol(a’ + te Ze), 
Bounds for $f_1^{I}, f_1^{II}, f_1^{III}$ can now be found by dimensional estimates.
Combining Eq. (5.11.18) with Eq. (5.12.33), the second of Eqs. (5.12.47) and the
first of Eqs. (5.12.49) implies all the following inequalities except the fourth:


< 2K 

OAOA l ~ ə — 00 ` 020 > 

Oho 2! Ey p. 
JAJAA Sew Shr 
JAOAOA 20 = (00 = 00)? e 029 
afno 2 

< Byend 

OA. ‘lege oo (5.12.66) 

N 
Ie dela < CBoezC 00, 
=) zi = Oo 
|| 3 50,€0—360 < BzeoCa069 “e 2 < 2 
Al 15, 0-350 Š Al Bze0oC EoC NEH! 6g” T" < ôo. 


To prove the fourth inequality, make use of the second of Eqs. (5.12.33):


a= D2 wal Y oet 


|v|>No |v|>No 
< Dy Mille M < coe DY) eM (5 19.67) 
|v|>No |v|>No 


1 1 
<eqgole 24% X- e~2%l¥l < BolegCeo 
|v|>0 


in C'(@0, ĉo — ôo; Ao) [see also (5.12.35)]. 

For (A’,z/) € C(400, ĉo — 360; Ao) it is (A’ + t&,z’e’4) € C(400, £0 — 
259; Ao); so we can insert the bounds of Eq. (5.12.66) into Eq. (5.12.65), using 
e£ < e < 4 for simplicity, to obtain 


AEEA <2E 05 L (Bze0C 0089 e?) 
< 2° BZ l eC EoC 0083“, 
TAY, te <24B Bol 20873! (5.12.68) 
Ifi 133, 60-360 S4 1436E90 0g 20, 
fi Fe ua <BolerC o0 
so that 
[Filiz ¢0-36) £ BoeoCEoC8 * 00, (5.12.69) 


where Bg is suitably chosen.** 


33 e.g. B7 = 29 Be = 290? B2 < 24402 (1)229. 
Next note that A, € Siz (Ao) so that C(o1,&1; A1) C C( $00, £0 — 


459; Ao) so that the boidar of C(o1,€1;A1) is quite far from that of 
C( 400, ĉo — 360; Ao) and Eq. (5.12.69) yields a dimensional estimate of 


~ def Of of 
a= supap A ’ Z)| +155 


where the supremum is taken over C(01, £1; A1). It is given by 


(A’,z’)| (5.12.70) 


1S liliz 320:0- sega 
(5.12.71) 


£41 5—46— 
<< 00 alt ae BregC(EoC)* No" o 
recalling that 01 = +700 and suitably choosing B7.*+ 
So, collecting all the above inequalities (5.12.71), (5.12.62), and (5.12.60) 
and the definitions of 0),&,, the following quantities can be taken as charac- 
teristic parameters for the Hamiltonians hj, fı in C (01, €; A1): 


-— IV _ _ y= ae thts 
01 320EyC NET O T So E Cand’ 
&1 =o — 460, 
aR + £0, (5.12.72) 


M =no + 4eonkoz 1, 


1 
ET =BgCed(Co)? (log Ceodh 
having replaced No in Eq. (5.12.71) with its expression in Eq. (5.12.30), and 
Bg = 2°*'B; provided the condition of Eq. (5.12.59) holds. 
Consider now the mappings Kn : (On, En, En, Tn; En) Ta (On41, Ents En+1, 
Mn+1;€n+1) defined by Eq. (5.12.72) in which ĉo is replaced by 6, and 0 > 
n, 1 — n + 1, forgetting (temporarily) the condition of Eq. (5.12.59). Then 


Merge 


hl 


(Ons En; En, Nn; En) = Kn-1 vow Ko(@0, £o, Eo, no, £0), (5.12.73) 


and it becomes possible to check that if Ceo is small enough (so that the 
inequality in Eq. (5.12.85) below holds), then: 


44 e.g. By = 29 Be = 21844092 (¢1)2, 
En > boo = 0-4 4; 
j=0 
En <2Ep, 


n 


(egC)2+3)” < EnC < (e00), 


ees eee 
(EoC)” [65 (n!)224”? (log(Ceo)-1)2"] 


On = 
An inductive proof of the validity of Eq. (5.12.74) under a condition of the 
form of Eq. (5.12.85) is described below, between Eqs. (5.12.75) and (5.12.86), 
for completeness. The reader should, however, first realize that Eqs. (5.12.74) 
and (5.12.86) are quite obviously valid under a condition of the type of Eq. 
(5.12.85) below. 
The first inequality follows from our choice of $\delta_j$. The second and third
follow from the last two if, say,


Co 


1 3)\n 1 
E0 < 3 Fo, 2 (Ceo) a < 2; eonoog | < g’ (5.12.75) 
Y (CE0) È" | (Hoc) E7" (n!)?23"" (l0g(Ce0)71)?"] H] < 210g2 
n=0 


The fourth inequality in Eq. (5.12.74) is proved by remarking that, for æ < 1, 
it is SUPo<g<1 T° (log 4)“+! < a~t (€+1)!, Va > 0; hence from Eq. (5.12.72), 


EnC > Bg(C E0)? (eEn-1Cfn-1)7 0 2G 
; - peter (5.12.76) 
EnCd§ <Bg(2CEp)* (En—-1Cbn-1)°6, T 23 t (2+ 1)! 


where the ratio s has been bounded by 2~* or 1 (below or above) and we 


have applied the above elementary inequality with a = 4. 


3 
Since Bg is very large, e.g., Bg2~* > 1, and CEp > 1, g6 > 1 we 


conclude from the first of Eqs. (5.12.76) that 
e400, > (en-108f 1)? > (e0085)?” (5.12.77) 


which implies the lower bound in Eq. (5.12.74) for Cen, if Ceo is small enough. 
And by the explicit expression (5.12.29) for ôn, the condition turns out to be 


(Ceo)~ 468 > 1. (5.12.78) 
By Eq. (5.12.29) the second inequality in Eqs. (5.12.76) gives 
Geo. < (Ceo) ®" II 
k=1 
Gyro 


- [Bs(2CEC)e" + (ya +n— hye] 


= (Ceo) ©?" [Bs(2CBoC)e"t (e + (£ 


Ba (p-d (5.12.79) 
16 


as n—1 
12¢-44)()"-1 OR EG)" log(1+k) 


(3)" 


a 
< (Ceoð6 (E00)? E Bo) 


if Bo is suitably chosen.45 Since for all n > 0, 66>” < ôn, Eq. (5.12.79) implies: 


2 


(Cen) E [(Ceo)*~( 3)" (EoC)*£5°*-3 Bo] (3) 


‘ (5.12.80) 
<(Ceo) 4) 
provided (the worst case being n = 1) 
(Ceo) ™ (EoC)*€5 Bo < 1. (5.12.81) 


Finally consider the last of Eqs. (5.12.74). By the recursive definition of 
On, 


(dn—1 sv o) tt 


AET E L E (5.12.82) 
(25 EoC A)r E+1) ( Il- log (Oege) ) 
By Eq. (5.12.77) and the explicit form of ôn, this becomes 
1 nl-2en t+ 
n 2 OS | SO 5.12.83 
Q Qo (4256+10 FoC)” e ses | ( ) 


So if $C\varepsilon_0$ is small enough, the last inequality in Eqs. (5.12.74) holds. More
precisely, it holds if


(log(Ceo)-1)™* 
26 + 11 


Note that the conditions (5.12.84), (5.12.81), (5.12.78), and (5.12.75) can all 
be satisfied by imposing a single condition which will also imply Eq. (5.12.59): 


Ceo < 5, tai: (5.12.84) 


45 e.g. Bo = 8Bg32 C+D (¢ + 1)13 26 exp{ $ (122 + 4) 5, 59(3)" log(1 + h)}. 
BiocoC(EoC) (Eoo ES? < 1, (5.12.85)


where $B_{10}$ is a suitable constant and so are $a, b, c > 0$. And if Eq. (5.12.85)
holds, then, $\forall n \ge 0$, the analogue of Eq. (5.12.59)


Bee OFC (aon) NE), <1 (5.12.86) 


holds if $C\varepsilon_0$ is small enough. In fact, Eq. (5.12.85) implies Eq. (5.12.74), as
just shown, and Eq. (5.12.74) inserted into Eq. (5.12.86) just gives a condition
like Eq. (5.12.85) with possibly new values for the constants $B_{10}, a, b, c$.

So a condition of the form of Eq. (5.12.6) guarantees that the sequence of
numbers $(\rho_n, \xi_n, E_n, \eta_n, \varepsilon_n)$ recursively defined in Eq. (5.12.73) verifies Eqs.
(5.12.74) and (5.12.86) as well.

This means that under the condition (5.12.6), with $B, a, b, c$ suitably chosen
(and $\ell$ dependent), it is possible to define a sequence of completely canonical
transformations, $\mathcal{C}_0, \mathcal{C}_1, \ldots$, having the form


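$$ A = A' + \Xi^{(j)}(A', z'), \qquad z_k = z'_k \exp\big(i\,\Delta^{(j)}_k(A', z')\big) \tag{5.12.87} $$

and such that, $\forall j = 0, 1, \ldots$,

$$ \mathcal{C}_j : C(\rho_{j+1}, \xi_{j+1}; A_{j+1}) \to C(\rho_j, \xi_j; A_j) $$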

and [see Eq. (5.12.52) and the discussion following it, and Eqs. (5.12.49) and
(5.12.47)] $\Xi^{(j)}, \Delta^{(j)}$ can be bounded in $C(\tfrac14\tilde\rho_j, \xi_j - 3\delta_j; A_j) \supset C(\rho_{j+1}, \xi_{j+1};
A_{j+1})$ by

lAj+1 — Ayl < &, 

[EO (Al 2!)| < B3ejC87” oj, (5.12.88) 

[AP (A',z’)| < 2Byej,C6, F, 

Qj 


where $\tilde\rho_j$ is defined as $\tilde\rho_0$ with the index $j$ replacing 0 everywhere.


The maps $\mathcal{C}_j$ are very close to the identity map on the very small set on
which they are defined. In fact, setting $|(A,z)-(A',z')| = |A-A'| + \rho_0|z-z'|$,
for every pair $(A', z'), (A'', z'') \in C(\rho_{j+1}, \xi_{j+1}; A_{j+1})$ it is:

$$ |\mathcal{C}_j(A', z') - \mathcal{C}_j(A'', z'')| \le (1 + \theta_j)\,|(A', z') - (A'', z'')| \tag{5.12.89} $$

where $\theta_j$ is a small number that can be taken to be


0; = Bete 2 (5.12.90) 

Qj Qj 
which is implied by a simple calculation based on the dimensional estimates of
the derivatives of $\Xi, \Delta$ on the set $C(\rho_{j+1}, \xi_{j+1}; A_{j+1})$ (possible since $\Xi, \Delta$ are
holomorphic on a much larger set, i.e., $C(\tfrac14\tilde\rho_j, \xi_j - 3\delta_j; A_j)$). Eqs. (5.12.74)
imply that $\theta_j \to 0$ very fast, in particular, $\sum_j \theta_j < \infty$. They also imply
that $\rho_j \to 0$ as $j \to \infty$. Then a torus can be parametrically defined


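$$ (A, z) = \mathcal{C}_0 \cdots \mathcal{C}_{n-1}\mathcal{C}_n(A_{n+1}, z'), \qquad z' \in T^\ell \tag{5.12.91} $$

which can be written more explicitly as

$$ A = A_0 + \alpha^{(n)}(\varphi'), \qquad \varphi = \varphi' + \beta^{(n)}(\varphi'), \qquad \varphi' \in T^\ell \tag{5.12.92} $$

where $\alpha^{(n)}, \beta^{(n)}$ are defined by comparison between the right-hand sides of
Eqs. (5.12.91) and (5.12.92). By construction, $\alpha^{(n)}$ and $\beta^{(n)}$ are holomorphic
on the multiannulus $C(\xi_n) \supset C(s)$ and also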


ja) (z) — a= (2)| + 99/8 (z) - B® (2)| = 
= |Co---Cn(An4i,2') — Co-++Cn—1(An,2’)| 


< (TJ + 4))|Cn(Ansi2") — (An, 2) (5.12.93) 
j=1 

z (Jie +0;))((An+1 — An| + [=| + ooe&|A™) ged oe 
j=1 


where |Z], |A®™®]| denote the right-hand sides of the second and third of 
Eqs. (5.12.88). Since on ws? 0 very fast, by Eqs. (5.12.88) and (5.12.74), the 
right-hand side of Eq. (5.12.93) is summable over n. Hence, the limits 


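$$ \alpha_\infty(\varphi') = \lim_{n\to\infty} \alpha^{(n)}(\varphi'), \qquad \beta_\infty(\varphi') = \lim_{n\to\infty} \beta^{(n)}(\varphi') \tag{5.12.94} $$

exist and define (by the convergence theorem of Vitali on the sequences of
holomorphic functions) two holomorphic functions of $\varphi'$ in $C(s)$. Via the
parametric equations:

$$ A = A_0 + \alpha_\infty(\varphi'), \qquad \varphi = \varphi' + \beta_\infty(\varphi'), \qquad \varphi' \in T^\ell \tag{5.12.95} $$

a torus $\mathcal{T}(\omega_0) \subset S_{\rho_0}(A_0) \times T^\ell$ is defined.

From Eqs. (5.12.88) and (5.12.74) one deduces that $\alpha_\infty$ and $\beta_\infty$ are small
if $\varepsilon_0$ is small, i.e., a property like Eq. (5.12.9) holds (possibly redefining
$B, a, b, c$).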

So it remains to prove that $\mathcal{T}(\omega_0)$ is an invariant torus run “quasiperiodically”
with spectrum $\omega_0$. The Hamiltonian flow $S^{(n)}_t$, which describes in the
coordinates defined by the canonical transformation $\mathcal{C}_0 \cdots \mathcal{C}_{n-1}$ the perturbed
Hamiltonian flow $S_t$ associated with Eq. (5.12.1), is such that the coordinates
of $S^{(n)}_t(A_n, \varphi')$ are

$$ A_n + \rho_n O(\varepsilon_n t), \qquad \varphi' + \omega_0 t + O((1 + E_n t)\varepsilon_n t) \tag{5.12.96} $$

because the Hamiltonian $f_n$ contributes terms of order $O(\varepsilon_n)$ to the equations
of motion. Of course Eq. (5.12.96) holds only as long as the point in Eq.
(5.12.96) is inside $C(\rho_n, \xi_n; A_n)$.46

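If $t > 0$ is fixed, it is clear that $\rho_n O(\varepsilon_n t) < \rho_n$ for $n$ large, by Eq. (5.12.74),
and, therefore, by Eqs. (5.12.89) and (5.12.96), we get

$$ \big|\mathcal{C}_0 \cdots \mathcal{C}_{n-1}(S^{(n)}_t(A_n, \varphi')) - \mathcal{C}_0 \cdots \mathcal{C}_{n-1}(A_n, \varphi' + \omega_0 t)\big| \le \Big(\prod_{j=1}^{n-1} (1 + \theta_j)\Big) \big(\rho_n O(\varepsilon_n t) + \rho_0 O(\varepsilon_n t(1 + E_n t))\big) \xrightarrow[n\to\infty]{} 0. \tag{5.12.97} $$

Hence

$$ \lim_{n\to\infty} S_t\,\mathcal{C}_0 \cdots \mathcal{C}_{n-1}(A_n, \varphi') = \lim_{n\to\infty} \mathcal{C}_0 \cdots \mathcal{C}_{n-1}(S^{(n)}_t(A_n, \varphi')) = \lim_{n\to\infty} \mathcal{C}_0 \cdots \mathcal{C}_{n-1}(A_n, \varphi' + \omega_0 t) \tag{5.12.98} $$

but the first and third limit exist by Eqs. (5.12.91) and (5.12.94) and their
equality means

$$ S_t(A_0 + \alpha_\infty(\varphi'), \varphi' + \beta_\infty(\varphi')) = (A_0 + \alpha_\infty(\varphi' + \omega_0 t), \varphi' + \omega_0 t + \beta_\infty(\varphi' + \omega_0 t)) \tag{5.12.99} $$

which just says that $\mathcal{T}(\omega_0)$ is an invariant torus for the perturbed motion
on which quasi-periodic motions with spectrum $\omega_0$ take place ($t > 0$ being
arbitrary). mbe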


5.12.1 Problems 


1. Let $A \in S_{\rho_0}(A_0)$ and write Eq. (5.12.4) for $\nu = e = (1, 0, \ldots, 0)$ as $|\omega_1(A)|^{-1} \le C$.
Deduce that this implies $E_0 C \ge 1$, with the assumptions and notations of, say, Proposition


46 One finds this as follows: let SE (VAn, gy’) = (A(t), e(t)) so that 


Therefore by Taylor’s theorem and by a dimensional estimate, one finds after integration 
over t 


|A(t) — An| < Enont, lp’ (t) — py’ — wot| < (Enon | nent) t+ Ent 
which implies Eq. (5.12.96). 
22. In the above context show that M;;(Ao)| < a by a dimensional estimate [see Eq.
(5.11.18)], and deduce from this that 4M3 Eo > oo. (Hint: 1 = (|M(A)~!M(A))u = 
| har IM(A) gy M(A)r1| < Z= M(A)=}] or < 2 Za IM(A) -H= 


2. Consider the Hamiltonian on Rt! x RIHI, 


A2 B2 1 d 1 d 
z ta rere (p,p) ET xT, (A,B) ER xR 


Consider the motions near the resonating torus A = 1,B = 0 and write 
A=1+ vE ae(tVe), yp = ôe (t€), 
=Vebe(tve), p = Ye(tve) 


for the solution to the Hamiltonian equations with initial datum 


ae(0) = ao, be(0) = bo, ye (0) = Yo, ôe(0) = do. 


Show that the solutions to the Hamiltonian equations are such that $a_\varepsilon, b_\varepsilon, \gamma_\varepsilon$ (but not $\delta_\varepsilon$)
have a limit as $\varepsilon \to 0$ and this limit verifies the equations


à=0, y=b, poe A 
oy 
NES Fa) = 2m ECO, VE . Show that the limit is approached with a speed O(et) at fixed 


t. (Hint: Write the Hamiltonian equations and note that, after dividing them by $\sqrt{\varepsilon}$, they
converge formally to the above equations for $a, b, \gamma$. Then apply the ideas of the proof of
Proposition 13, p.186, §3.8, and of §3.7 and §3.8.)


3. In the context of Problem 2, take d = 1. Show that “up to a time O(1/e)”, the motion 


is quasi-periodic with pulsations wy = 1,w2g = VE where 
0:70 


> 


an E dy 
Tam =j [2B — FO) 


where Eg = tog + f(yo) and y—,y+ are 0 and 2r if the equation Eo = f(y) has no 
roots; otherwise, they are two suitably chosen roots of this equation. Consider only the case 
Tbo,yo < +00 (however, the data for which Ty, 4, = +00 are exceptional). 


4. Find a result analogous to the one of Problem 2 near a general torus with rational 
pulsations for the solution flow of the equations associated with the Hamiltonian 


$$ \tfrac{1}{2} A^2 + \varepsilon f(\varphi). $$


(Hint: First extend Problem 2 to the case when $f$ depends on $A, B$ also; then canonically
change variables so that the torus under analysis appears to be run with pulsations $\omega =
(\omega_0, 0, 0, \ldots, 0)$.)


5. Consider a time-dependent Hamiltonian with one degree of freedom, periodic in time
with period $2\pi$: $h_0(A) + f_0(A, \varphi, t)$; see Problems 12-14, p.478, §5.10.

Suppose that $h_0$ is holomorphic in $S_{\rho_0}(A_0)$ and that $f_0$ is holomorphic in $C(\rho_0, \xi_0, A_0) =
S_{\rho_0}(A_0) \times C(\xi_0)^2 = \{A, \varphi, t \mid (A, z, \zeta) \in \mathbb{C}^3, |A - A_0| < \rho_0, e^{-\xi_0} < |z| < e^{\xi_0}, e^{-\xi_0} < |\zeta| < e^{\xi_0}\}$,
where $z = e^{i\varphi}, \zeta = e^{it}$. Using the formal perturbation theory of Problem 12, §5.10, p.478,
prove that if dho #0 and fo is “small” and 


$$ |\omega(A_0)\nu_1 + \nu_2|^{-1} \le C(|\nu_1| + |\nu_2|)^{\alpha}, \qquad 0 \ne \nu \in \mathbb{Z}^2, $$

for some $C, \alpha > 0$, then the perturbed motion, regarded as taking place on the space of the
variables $(A, \varphi, t)$, leaves invariant a torus on which a quasi-periodic motion with pulsations
$(\omega(A_0), 1)$ takes place in the following sense. There exist two holomorphic functions $\alpha_\infty, \beta_\infty$
on $T^2$ such that setting $A = A_0 + \alpha_\infty(\varphi', t')$, $\varphi = \varphi' + \beta_\infty(\varphi', t')$, $t = t'$, the solution of the
equations of motion with datum assigned at time $t'$ and given by $A = A_0 + \alpha_\infty(\varphi', t')$, $\varphi =
\varphi' + \beta_\infty(\varphi', t')$ for some $\varphi' \in T^1$ evolves at time $t' + \tau$ into

$$ A(\tau) = A_0 + \alpha_\infty(\varphi' + \omega\tau, t' + \tau), $$

$$ \varphi(\tau) = \varphi' + \omega\tau + \beta_\infty(\varphi' + \omega\tau, t' + \tau), $$

i.e., regarding the phase space as $\mathbb{R} \times T^2$, the above motions can be regarded as taking
place on a two-dimensional torus in $\mathbb{R} \times T^2$ and having pulsations $(\omega, 1)$. (Hint: Just repeat
the proof of Proposition 22. No real simplification arises in this apparently simpler case.)


6. Consider the Hamiltonian (“Duffing oscillator”) $H = \tfrac12 p^2 + \tfrac14 q^4 + \varepsilon q \sin t$. Fix an
initial datum $(p_0, q_0)$. Show that if $\varepsilon$ is small enough, the trajectory with datum $(p_0, q_0)$
at any initial time $t_0$ is uniformly bounded in time. (Hint: Show that $(p_0, q_0)$ is between two
unperturbed tori in the phase space $\mathbb{R}^1 \times T^2$, of the system with $\varepsilon = 0$, having pulsations
$(\omega_1, 1), (\omega_2, 1)$ (see preceding problem) nonresonant and with finite resonance parameter
$C$. Use Problem 5 to show that for $\varepsilon$ small, such tori are slightly deformed but remain
invariant. Then use the fact that a two-dimensional torus in a three-dimensional space has
an “interior” and an “exterior”.)


7. In the context of Problem 5, define » = (y,t) and Eo > 


’ 


dh 
dA 


—1 
adh 
>» 0 2 (4 ) 
20 20 


£0 = 2E loo.£o + A | 2 loose: Then the condition of smallness of co for the property 
envisioned there is implied by the following condition, as can be proven:47 


107° (no £009 ')*(CEo)*Ceo < 1. 


Derive a similar formula (i.e., prove the statement in Problem 5, explicitly computing the 
constants) and try to improve it. 


8. Consider the system on $\mathbb{R} \times T^2$ (“Escande-Doveil pendulum”)

$$ \tfrac{1}{2} A^2 + \varepsilon(\cos\varphi + \cos(\varphi - t)), $$

where $t$ is the time; see Problems 12-14, §5.10, p.478. Apply the result of Problem 5 with
the estimate in Problem 7 to place a bound on how large $\varepsilon$ must be in order that one
cannot guarantee the “stability of the quasi-periodic motions with pulsations $(\omega_0, 1)$” with

$$ \omega_0 = \tfrac12(\sqrt5 - 1) = \{\text{golden section}\}. $$
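
The arithmetic property that makes the golden section a natural choice of pulsation can be explored numerically. The sketch below is plain Python; the function name, the search range and the choice of exponent $\alpha = 1$ are illustrative assumptions, not taken from the text. It scans integer pairs $(\nu_1, \nu_2) \ne 0$ and records the smallest $C$ for which the condition of Problem 5, $|\omega_0\nu_1 + \nu_2|^{-1} \le C(|\nu_1| + |\nu_2|)^{\alpha}$, holds on the scanned range.

```python
import math

OMEGA0 = (math.sqrt(5.0) - 1.0) / 2.0  # golden section

def diophantine_constant(omega, max_nu1, alpha=1.0):
    """Smallest C with 1/|omega*nu1 + nu2| <= C*(|nu1|+|nu2|)**alpha
    over 0 <= nu1 <= max_nu1 (nu -> -nu is a symmetry, so nu1 >= 0 suffices)."""
    worst = 0.0
    for nu1 in range(0, max_nu1 + 1):
        base = math.floor(-nu1 * omega)
        # only integers nu2 close to -nu1*omega can make the denominator small
        for nu2 in (base - 1, base, base + 1, base + 2):
            if nu1 == 0 and nu2 == 0:
                continue
            denominator = abs(omega * nu1 + nu2)
            size = abs(nu1) + abs(nu2)
            worst = max(worst, 1.0 / (denominator * size ** alpha))
    return worst

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, diophantine_constant(OMEGA0, n))
```

A finite scan can only bound $C$ from below; the fact that the printed value does not grow with the range is suggested, not proved, by such an experiment.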


9. Same as Problem 8, but applying the results of Problems 5 and 7 to the system obtained
from the one in Problem 8, by first removing the perturbation to $O(\varepsilon)$ by ordinary
perturbation theory; see Problems 12-14, §5.10, p.478.


10. Same as Problem 9, but first removing the perturbation to $O(\varepsilon^2)$. Check that in this
way one obtains much better results.


11. Suppose that, observing the motions of the system in Problem 8, one is able to see
them with an absolute precision $\eta$ of four digits (in decimal basis) and for an observation
time $T$ about equal to 50 periods of the forcing term, $T = 50 \cdot 2\pi$.

Note that (see (5.12.86)) to achieve a given accuracy for a given time, one only needs to


47 193). 
“remove the perturbation” to an order $n$ such that $O(\varepsilon_n T(1 + E_n T)) < \eta$. Using this
remark, estimate a threshold for the “survival” of motions which look quasi-periodic, within
the error $\eta$ up to time $T$, with pulsations $(\omega_0, 1)$, $\omega_0 = \tfrac12(\sqrt5 - 1)$, and compare the
result with the experimental value of the “threshold of disappearance” of the quasi-periodic
motion in question: $\varepsilon \approx 0.75$.


12. Try to compute the constants B, a, b, c in Eq. (5.12.6), explicitly improving the values 

of the constants B1-Bg suggested in the proof of Proposition 22. An example of a rigorous 
+4: 

result is 


C104% (No E003 Che) te, 00O <1. 


The following problems constitute a follow up of the problems in §4.10 on
the theory of precession. None of the approximations suggested below for performing
the lowest order perturbation theory is, strictly speaking, necessary:
the calculations could be easily carried out without any approximation at
all, leading essentially to the same results. They would however be extremely
cumbersome. In practice they have never been done because already with the
approximations below it is clear that one has reached a precision where the
non rigid structure of the Earth is important together with its density irregularities,
and the consequent non rotationally symmetric shape: therefore the
use made of the following calculations is just to provide some formulae with
free parameters to be used to perform numerical fits in the tables, much in the
same spirit that animated the Greek astronomy (which is not a good reason
for not trying someday a better calculation to test if Newtonian mechanics can
be applied to the theory of nutation to investigate the elastic properties of the
planet). Let $\bar\omega_D, \bar{i}$ be some approximations of the mean daily rotation angular
velocity and of the mean inclination of Earth’s axis. Below we suppose, as it is
the case for the Earth, that $1 \gg (1 - L^2/A^2)^{1/2} \gg \bar\omega_p/\bar\omega_D$ (where we call $\bar\omega_p$
the precession velocity calculated from the formula of problem (16) of §4.10,
with such approximate values for the Earth angular velocity and inclination);
this means that for many purposes the axis of rotation, the axis of symmetry
and the axis of the angular momentum of the Earth can be confused, even
though the theory is precisely looking for phenomena that exist just because
such axes are not identical.


13. In the context of problems (8) through (7) of §4.10, show that the Hamiltonian in §4.10,
problem (14), can be written, if $K = K_z$:


A? 3m kM gs K2 . 2 K?\ 1/2 L?2\ 1/2 
2J 2a3 s(a az) sie Ar y+(1 ae) ( ae) 


(A -ysinar -7+ y) + (Ž +) sine — 7 )) 


when one neglects, in the perturbation terms, $(1 - L^2/A^2) = \sin^2\theta$ (but not its square root)
and the eccentricity of the Earth orbit. This means, as it appears below, that we neglect
the difference between the Earth angular momentum axis, the Earth instantaneous rotation
direction and the Earth symmetry axis everywhere in the Hamiltonian except in the places
where they will produce the largest corrections to the equations of motion.


48 See [11].


14. Show that neglecting terms proportional to $(1 - L^2/A^2)$ as well as the variability of the
Earth axis the Hamiltonian in problem (13) can be put, using the notations of the problems
of §4.10, in the form:


2 2 


A? 3 JK? 3 1 K? 
Hp =wrB 4 oF ser Az H smets 5(1 az) s2- 7)+ 


E 1)sin(à — y + p) +4 Š + 1) sin(A — y a(t T T 


A2 Æ 


Note that L is a constant of motion and therefore it will not be considered a canonical 
variable; the new canonical coordinates (B, A) have been introduced artificially to make the 
system autonomous. Note also that the parameters A,7 are fictitious parameters, so far, 
as they drop out of the above formula if Wp is fully rexpressed in terms the constants in 
problem (13). (Hint: note that if the orbit is regarded as circular then Ap can be identified 
up to an additive constant with the average anomaly; hence it rotates at constant rate wT; 


the auxiliary variable B will play no role here). 


15. The classical theory of nutation averages the Hamiltonian in problem (14) over the fast
angles $\varphi$, but not over the relatively slower angles $\lambda$ or over the very slow $\gamma$. The Hamiltonian
thus obtained should reliably describe motions over a time scale $\gg 2\pi/\omega_D = 1$ day and it
is:
4&2 3 3 K? 7 
Hp = —+=mwz7J | 1- — J sin* (A - 
D Ta gi T A2 ) ( y) 

Show that this is integrable by quadratures and, setting y — wrt = ¥, reducible to the 
quadrature: 


ï dy! 

3 2 Hae 
49 —wr — S13miw7.(2K/A?) sin? 7! 

2 


3 
gieiswr ) sin? ¥ wrk wr Ko 


having called Ko, o, to the values of K,7,t when y — wrt = 0. Show that, neglecting the 
variations of K of higher order in 7 one finds that the motion is: 


í 3 2Ko,. 1 
y =— wot 5erdm TE (sin? (wrt + ào — yo) — z) 
; 3 K 

K =-— ee — G2) sin wrt + ào — 70) 


hence, recalling that cos ô = K/a and writing 6 = io + 6’ and y + wpt = 7’, it is: 


3 3 
= sn( ==) sin ĝo cos 2wr, y= sn( ==) cos ĝo Cos 2wT 
2 WD WD 


and we see that the two Euler angles expressing the deviations from the mean precession 
motion move on a small ellipse with a period equal, in this approximation, to 27/w7. This 
is the solar nutation motion. 


16. If the Moon is taken into account along a similar scheme one finds that the Moon
nutation makes the $i', \gamma'$ revolve still over an ellipse. The theory has to be done from the
beginning as the main cause of the nutation due to the Moon is the fact that the plane of
motion of the Moon is not fixed in space but has a precession on a cone of angle equal to the
Moon inclination $i_L \approx 5^\circ$ with a period $2\pi/\omega_{pL}$, for some $\omega_{pL}$. The nutation due to the
Moon comes out to be on an ellipse about 10 times larger than that calculated above for the
Sun contribution and has a period of the order of $2\pi/\omega_{pL}$. Check that such period has the
order of 20 years. (Hint: The precession of the Moon plane is due to the gravitational force
of the Sun. One can imagine, for the purpose of studying phenomena that take place over
a time scale large with respect to the Moon period of revolution ($T_L \simeq 27$ days) that the
Moon is uniformly spread on its orbit on an annulus of radius $a_L$ whose plane is inclined by
$i_L$ to the ecliptic and which is rotating around its center $T$ at velocity $\omega_L$ equal to the mean
angular velocity of the Moon $\omega_L = 2\pi/T_L$. The annulus is at a distance $a$ from the Sun and
gravitates around it with angular velocity $\omega_T$ (neglecting the eccentricities of Earth and
Moon), hence it has a precession that can be calculated from that of the Earth simply by
using the value $\eta$ appropriate for an annulus, i.e. 1/2 as the inertia moments of an annulus
are $J = M_L a_L^2$ and $I = J/2$. Hence the precession velocity is $\omega_{pL} = -(3/4)\omega_T^2\omega_L^{-1}$.)
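
The order of magnitude requested in the hint can be checked with a few lines of arithmetic. The sketch below is Python with illustrative names; the year length of 365.25 days is an assumption added here, while the 27 days of the Moon revolution is the value quoted in the hint. It evaluates the period $2\pi/|\omega_{pL}|$ from the annulus formula $\omega_{pL} = -(3/4)\,\omega_T^2/\omega_L$.

```python
import math

T_EARTH_DAYS = 365.25  # assumed length of the year, in days
T_MOON_DAYS = 27.0     # Moon period of revolution quoted in the hint

def node_precession_period_days(t_earth, t_moon):
    """Period 2*pi/|omega_pL| for omega_pL = -(3/4) * omega_T**2 / omega_L."""
    omega_t = 2.0 * math.pi / t_earth   # Earth orbital angular velocity
    omega_l = 2.0 * math.pi / t_moon    # Moon mean angular velocity
    omega_pl = -0.75 * omega_t ** 2 / omega_l
    return 2.0 * math.pi / abs(omega_pl)

period_days = node_precession_period_days(T_EARTH_DAYS, T_MOON_DAYS)
period_years = period_days / T_EARTH_DAYS

if __name__ == "__main__":
    print(period_days, period_years)
```

In closed form the period is $(4/3)\,T_T^2/T_L$, so the estimate depends only on the ratio of the year to the lunar month.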


18. Show that the generating function of the canonical map formally removing, from the 
above Hp, the perturbation to higher order is Aggy + Koy + BoA + & with: 


z] 1 q Kê 


1l A=) =k 


4wr A? 1 — wp/wr wp 

(1 BN, aye | Ko 1) cos(A — y + p) 
A2 A? Ao 1+ (wr — wp)/wp 

Ko +1) cos(\ — 7 — ¥) i 


Ao 1 — (wr — wp)/wp 


(Ao, Ko, PV à) = 


cos? 


= 


where wp = Ao/I3, and wp = @pA/2A2 cos?. 


19. Consider the canonical map with generating function Anpy+ Koy+BorA+(Ao, Ko, p, 7) 
and introduce the parameters € = (1 — L?/Ag2)1/2, cosig = Ko/Ao and set q+ = (1+ 
cos ig)/cosig. To simplify the calculations choose the so far arbitray constants 7,@p to be 
identical to i9,wp = Ao/I3. Show that with this choice of 7,@p the map is generated by 
the relations: 


Wp cos(A — y — P) cos(A — 7 + ¢) 

p =- — |u L tM 
WDE 1— (wr — wp)/wp 1+ (wr — Wp)/wp 
wp sin2(A — 7) 


Yo = 
wr 2(1 — wp/wr) 
A in(A-— y — in 2(A — 
ped oes: snaz v-o) Sna 19) ) 
WD 1— (wr — wp)/wp 1+ (wr — wp)/wp 
2(rA\ - 
Kakoa Eriti OE 
2wT 1 — wp/wT 


and, trivially, Ao = ». Neglecting terms of order O((wp/wr)?) as well as terms of order 
O(wpe/wp) and assuming that wp/ewp tan ig is very small the above relations are trivially 
inverted, up to corrections of higher order, as: 


cos(Ao — Yo — Yo) cos(Ao — yo + Yo) ) 
DE 


1— (wr —-wp)/wp ` (1+ (wr — wp)/wp) 
wp sin2(ro — 70) 


Wp 
p =p0 + — | 4+ 
w 


J= 
2wr (1 — wp/wr) 
e TIR TENER 

a=4(1- Wp esinio (a4 sin(ào — 0 — 0) 2, SQ 2t) 

4wp 1 — (wr — wp)/wp 1+ (wr — wp)/wp 

2(Ao — 

K =Ko(1 } vp - no E Qo w) 

2wr cos io 1 — wp/wT 
and the equations of motion are, in the new coordinates labeled by 0:

$$ \dot\varphi_0 = \omega_D, \qquad \dot\gamma_0 = \omega_p, $$

$$ A_0 = I_3\omega_D, \qquad K_0 = I_3\omega_D \cos i_0 $$

and $i_0, \omega_D, \omega_p, \varepsilon$, as well as the initial data for $\varphi_0, \gamma_0, \lambda_0$, must be regarded as parameters to
be determined from observations. They define the mean inclination, daily rotation, equinox
precession, and nutation constant.

Show that the terms neglected in problems (13) through (15) and above would add
oscillating terms with much smaller amplitude.


20. Consider the motions described in the new primed coordinates by the last equations of
problem (18). The angles $\varphi, \gamma$ can be thought to be animated by two distinct motions. The
first are the two precession motions:

$$ \varphi_0 + \omega_D t, \qquad \gamma_0 + \omega_p t $$

which are, respectively, the mean daily rotation and the mean precession of the equinoxes. The
second motion is linearly superposed to the first and is the motion obtained by replacing $\lambda$
by $\lambda + \omega_T t$, $\varphi_0$ by $\varphi_0 + \omega_D t$ and $\gamma_0$ by $\gamma_0 + \omega_p t$ in the trigonometric terms in the second
of problem (18). The second motion is the nutation caused by the Sun.


21. The axis $k_0$ with Euler angles $(i_0, \gamma_0 + \omega_p t)$ is called the mean axis of rotation: it is an
axis animated by a purely precessional, uniform, motion. Its node $m_0$ on the ecliptic plane
is the mean node or mean equinox and it is rotating at uniform angular velocity $\omega_p$. Show
that, in the approximation in which the Earth axis, the angular momentum axis and the
angular velocity axis are identified, the actual inclination $i$ of the axis and the longitude $\delta$
of the apparent (i.e. the actual) node with respect to the mean node are given by:


K 2(A — 
cosi =— = cos io (ee n EA) 
A 2wr cos io 1 — wp/wr 
in(A — yo — in(A — 
rou ree sin(A — 0 — Yo) Legs sin(A — Yo + po) ) 
2wp 1— (wr — wp)/wp 1+ (wr — wp)/wp 
in 2(A — 
ee tan io sata 70) 
2wWT 1 — wp/wT 
having set q+ = (1 + cos io )/ cos io. 


22. Consider the mean Earth axis and a plane orthogonal to it. Show that the coordinates of 
ig, the actual rotation axis, and those of the mean rotation axis and mean equinox, ko, mo 
are: 

ig =(sinisin(yo + 6), — sin i cos(yo + 6), cos i) 


mo =(cos Yo, sin yo, 0) 
ko =(sin io sin Yo, sin ig cos Yo, cos io) 
Show that the coordinates of the extreme point projected on the just constructed plane are 
given by: 
x =i3 -mo = sindsini 
y =i - ko A mo = sin yo(cos i sin io sin Yo — cos ig sini sin(yo + 6))— 


— cos yo (cos io sini cos(yo + 6) — cos i sin ig cos i cos Yo) 
Show that if in the equations of motion one neglects the terms with the angles $(\lambda - \gamma_0 + \varphi_0)$
then the endpoint of the rotation axis describes an ellipse, i.e. $(x, y)$ describe an ellipse.


23. In general one has to take into account the force of the Moon, which in fact produces 
terms greater than the ones considered above from the Sun. However the motions will be 
basically of the same type: the nutation and precession will receive contributions also from 
the Moon and the other planets. If one really wants, one can improve the above description 
by distinguishing the three main rotation axes (the rotation axis, the symmetry axis and 
the angular momentum axis) and describe the motion of the Earth symmetry poles (polar 
motion) with respect to the instantaneous axis of rotation, which should be really taken as 
defining the equinox line: the motion thus described includes the so called polar motion, 
i.e. the motion of the angular momentum axis (and of the rotation axis) relatively to the 
symmetry axis. But of course the calculations become intricate and in the end they only 
provide formulae with free parameters that are determined empirically and used, as said 
above, for the preparation of the Ephemerides. The nutation motion is simply described 
by a motion of the Earth symmetry axis endpoint on an ellipse only if the very largest 
terms from the Moon contributions are considered: all the remaining corrections have the 
consequence that the motion of the pole around the mean pole is a quasi periodic motion 
with many periods (ranging from periods of the order of the day up, if one starts including 
in a very refined theory effects like the tides influence). But the quasi periodic motion 
can be a good approximation only as long as perturbation theory remains meaningful: the 
long time behaviour is possibly non quasi periodic and chaotic, even assuming the Earth
perfectly rigid: but the chaoticity takes place over a very small scale as the corrections to
the main nutational terms correspond to motions of the poles on the Earth surface of the
order of 10 m (and the main nutational terms correspond to the order of 100 m).


6 


Appendices 


6.1 A: The Cauchy-Schwarz Inequality


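1 Proposition. Let $\Omega$ be a closed bounded Riemann-measurable (or Lebesgue-measurable)
set in $\mathbb{R}^d$. Let $f, g \in C(\Omega)$ be two $\mathbb{R}$-valued functions. Then

$$ \Big|\int_\Omega f(\xi) g(\xi)\, d\xi\Big| \le \Big(\int_\Omega f(\xi)^2 d\xi\Big)^{1/2} \Big(\int_\Omega g(\xi)^2 d\xi\Big)^{1/2} \tag{A1} $$

PROOF. In fact $\forall \lambda \in \mathbb{R}$:

$$ 0 \le \int_\Omega (f(\xi) + \lambda g(\xi))^2 d\xi = \int_\Omega f(\xi)^2 d\xi + 2\lambda \int_\Omega f(\xi) g(\xi)\, d\xi + \lambda^2 \int_\Omega g(\xi)^2 d\xi. $$

Hence, this polynomial of second degree in $\lambda$ must have a non-positive discriminant.
Up to a factor 4, its discriminant is simply the difference between the square of the
l.h.s. of Eq. (A1) and the square of the r.h.s. mbe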


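Exercise.
Prove Eq. (A1) by remarking that (if vol$\Delta$ is the volume of the set $\Delta$)

$$ \int_\Omega f(\xi) g(\xi)\, d\xi = \lim_{\delta\to0} \sum_i f(\xi_i) g(\xi_i)\, \mathrm{vol}\Delta_i $$

where $\Delta_1, \Delta_2, \ldots$ are a pavement of $\Omega$ with parallel cubes with side $\delta$ and $\xi_i \in \Delta_i \cap \Omega$.
Then apply the “ordinary Cauchy inequality” $\sum_i |a_i b_i| \le (\sum_i |a_i|^2)^{1/2} (\sum_i |b_i|^2)^{1/2}$ to the
sequences $a_i = f(\xi_i)\sqrt{\mathrm{vol}\Delta_i}$, $b_i = g(\xi_i)\sqrt{\mathrm{vol}\Delta_i}$.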
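
The route suggested in the exercise can be followed numerically in one dimension. The sketch below is an illustration only: the function name, the test functions and the interval are assumptions chosen here. It builds the sequences $a_i, b_i$ from midpoints of intervals of side $\delta$, applies the ordinary Cauchy inequality to them, and returns both sides.

```python
import math

def riemann_cauchy_schwarz(f, g, lo, hi, n=1000):
    """Return (lhs, rhs) of the discrete Cauchy inequality built from a
    pavement of [lo, hi] with n intervals of side delta."""
    delta = (hi - lo) / n
    points = [lo + (i + 0.5) * delta for i in range(n)]  # xi_i: interval midpoints
    a = [f(x) * math.sqrt(delta) for x in points]        # a_i = f(xi_i) sqrt(vol)
    b = [g(x) * math.sqrt(delta) for x in points]        # b_i = g(xi_i) sqrt(vol)
    lhs = abs(sum(u * v for u, v in zip(a, b)))          # Riemann sum of f*g
    rhs = math.sqrt(sum(u * u for u in a)) * math.sqrt(sum(v * v for v in b))
    return lhs, rhs

if __name__ == "__main__":
    print(riemann_cauchy_schwarz(math.sin, math.exp, 0.0, 1.0))
```

As $\delta \to 0$ the two returned numbers tend to the two sides of Eq. (A1), so the inequality between the integrals is inherited from the one between the sums.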
6.2 B: The Lagrange-Taylor Expansion


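1 Proposition. Let $f \in C^{(k)}(\mathbb{R}^d)$ and suppose that $f$ has a zero of order
$(m+1) \le k$ in $x_0$. Then

$$ f(x) = \sum_{a_i \ge 0,\ \sum_i a_i = m+1} f_{a_1 \ldots a_d}(x) \prod_{i=1}^d (x_i - x_{0i})^{a_i} \tag{B1} $$

and the functions $f_{a_1 \ldots a_d}(x) \in C^{(k-m-1)}(\mathbb{R}^d)$. If they are regarded as functions
of $(x_0, x) \in \mathbb{R}^{2d}$ then they are in $C^{(k-m-1)}(\mathbb{R}^{2d})$.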


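PROOF. Consider the function $\lambda \to f(x_0 + \lambda(x - x_0))$ which has in $\lambda = 0$
a zero of order $m + 1$, i.e., it has the first $m$ derivatives vanishing. Then

$$ \begin{aligned}
f(x) &= \int_0^1 d\lambda_1 \frac{d}{d\lambda_1} f(x_0 + \lambda_1(x - x_0)) \\
&= \int_0^1 d\lambda_1 \int_0^{\lambda_1} d\lambda_2 \frac{d^2}{d\lambda_2^2} f(x_0 + \lambda_2(x - x_0)) \\
&= \int_0^1 d\lambda_1 \int_0^{\lambda_1} d\lambda_2 \cdots \int_0^{\lambda_m} d\lambda_{m+1} \frac{d^{m+1}}{d\lambda_{m+1}^{m+1}} f(x_0 + \lambda_{m+1}(x - x_0)) \\
&= \int_0^1 d\lambda \frac{(1-\lambda)^m}{m!} \frac{d^{m+1}}{d\lambda^{m+1}} f(x_0 + \lambda(x - x_0))
\end{aligned} \tag{B2} $$

Expressing the derivative with respect to $\lambda$ in terms of the derivatives with
respect to the $x$ coordinates it follows, inductively

$$ \frac{1}{(m+1)!} \frac{d^{m+1}}{d\lambda^{m+1}} f(x_0 + \lambda(x - x_0)) = \sum_{a_i \ge 0,\ \sum_i a_i = m+1} \frac{\partial^{m+1} f(x_0 + \lambda(x - x_0))}{\partial x_1^{a_1} \cdots \partial x_d^{a_d}} \prod_{i=1}^d \frac{(x_i - x_{0i})^{a_i}}{a_i!} \tag{B3} $$

and this proves the proposition, showing that

$$ f_{a_1 \ldots a_d}(x) = \int_0^1 d\lambda \frac{(1-\lambda)^m}{m!} (m+1)! \frac{\partial^{m+1} f(x_0 + \lambda(x - x_0))}{\partial x_1^{a_1} \cdots \partial x_d^{a_d}} \prod_{i=1}^d \frac{1}{a_i!} \tag{B4} $$

mbe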


Observation. The same proof holds if $f \in C^{(k)}(\Omega)$ and $\Omega$ is a convex open
set.
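
The last line of Eq. (B2) can be tested in one dimension. The sketch below is illustrative: the function names, the Simpson rule and the test function are choices made here, not part of the text. It takes $f(x) = \sin x - x$, which has $f(0) = f'(0) = f''(0) = 0$ (so $m = 2$, $x_0 = 0$), and compares $f(x)$ with the integral of $\frac{(1-\lambda)^m}{m!} \frac{d^{m+1}}{d\lambda^{m+1}} f(x_0 + \lambda(x - x_0))$ over $[0, 1]$.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0

def taylor_remainder(deriv, m, x0, x):
    """Integral of (1-lam)**m / m! * d^(m+1)/dlam^(m+1) f(x0 + lam*(x-x0)),
    where deriv is the (m+1)-th derivative of f in one variable."""
    dx = x - x0
    def integrand(lam):
        # chain rule: each lambda-derivative brings down one factor (x - x0)
        return (1.0 - lam) ** m / math.factorial(m) * dx ** (m + 1) * deriv(x0 + lam * dx)
    return simpson(integrand, 0.0, 1.0)

if __name__ == "__main__":
    x = 0.7
    # third derivative of sin(t) - t is -cos(t)
    print(taylor_remainder(lambda t: -math.cos(t), 2, 0.0, x), math.sin(x) - x)
```

The two printed numbers should agree up to the quadrature error of the Simpson rule.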


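2 Corollary. If $f \in C^{(k)}(\mathbb{R}^d \times \mathbb{R}^n)$ has a zero of order $m+1$, $m < k$, in $x_0$
for each $y \in \mathbb{R}^n$:

$$ f(x, y) = \sum_{a_i \ge 0,\ \sum_i a_i = m+1} f_{a_1 \ldots a_d}(x_0, x, y) \prod_{i=1}^d (x_i - x_{0i})^{a_i} \tag{B5} $$

and the functions $f_{a_1 \ldots a_d}$, thought of as functions of $(x_0, x, y) \in \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^n$,
are in $C^{(k-m-1)}(\mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^n)$.

PROOF. It is a repetition of the above proof.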

3 Proposition. If f € C® (RI x R”) , the function 


Ot Ad f(x% eres 
rey- D Sey ee 


has, for each $m < k$, a zero in $x$ of order $m$ at $x_0$ for all $y \in \mathbb{R}^n$. Furthermore,
the function in Eq. (B6) has a representation like the right hand side of Eq.
(B5) with functions $f$ having the same properties as those of Corollary 2 above.


PROOF. One first checks that Eq. (B6) has all the x derivatives vanishing in 
(xo, y) up to order m. Then one applies Corollary 2 or repeats the proof of 
Proposition 1. This time, 


> eee mH f (xo + A(x = xo), y) 


fnanca X; y) = m! ðr ; On," (B7) 


6.3 C: $C^\infty$-Functions with Bounded Support and
Related Functions


1. There is a nonzero function $\psi_a \in C^\infty(\mathbb{R})$, $\psi_a \ge 0$ with support in $[0, a]$,
$a > 0$, and one can take


Palt) =0 if t g (0,a) 


ER (C1) 
Palt) =e 7-9? if t € (0,a) 


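2. There exists a nondecreasing function $\chi \in C^\infty(\mathbb{R})$ vanishing for $t \le 0$ and
equal to 1 for $t \ge a > 0$. For instance,

$$ \chi(t) = c \int_{-\infty}^{t} \psi_a(\tau)\, d\tau, \qquad c^{-1} = \int_{-\infty}^{+\infty} \psi_a(\tau)\, d\tau. \tag{C2} $$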

—Co 
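3. The function

$$ \chi_{\alpha,\beta}(t) = \chi(t - \alpha + a)\,\chi(\beta - t + a) \tag{C3} $$

has value 1 if $t \in [\alpha, \beta]$, 0 if $t < \alpha - a$ or $t > \beta + a$ and is non-negative.

4. The function in $C^\infty(\mathbb{R}^d)$,

$$ \chi_{\boldsymbol\alpha,\boldsymbol\beta}(\xi_1, \ldots, \xi_d) = \prod_{i=1}^d \chi_{\alpha_i,\beta_i}(\xi_i), \tag{C4} $$

has value 1 on the parallelepiped $[\alpha_1, \beta_1] \times \ldots \times [\alpha_d, \beta_d]$; it is 0 outside $[\alpha_1 -
a, \beta_1 + a] \times \ldots \times [\alpha_d - a, \beta_d + a]$ and it is non negative.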
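
The constructions 1 to 3 above can be tried out numerically. The sketch below is only an illustration under a stated assumption: it takes $\psi_a(t) = \exp(-1/(t(a - t)))$ on $(0, a)$, which is one standard function with the required properties and is not claimed to be the formula of Eq. (C1). The function names and the midpoint quadrature are likewise choices made here.

```python
import math

def psi(t, a):
    """Assumed bump: exp(-1/(t*(a-t))) inside (0, a), zero outside."""
    if t <= 0.0 or t >= a:
        return 0.0
    return math.exp(-1.0 / (t * (a - t)))

def chi(t, a, n=400):
    """Normalized primitive of psi: 0 for t <= 0, 1 for t >= a (midpoint rule)."""
    if t <= 0.0:
        return 0.0
    if t >= a:
        return 1.0
    def integral(upper):
        h = upper / n
        return sum(psi((i + 0.5) * h, a) for i in range(n)) * h
    return integral(t) / integral(a)

def plateau(t, alpha, beta, a):
    """chi(t - alpha + a) * chi(beta - t + a): 1 on [alpha, beta],
    0 for t < alpha - a or t > beta + a."""
    return chi(t - alpha + a, a) * chi(beta - t + a, a)

if __name__ == "__main__":
    for t in (-0.6, -0.25, 0.0, 0.5, 1.0, 1.25, 1.6):
        print(t, plateau(t, 0.0, 1.0, 0.5))
```

The product in Eq. (C4) is then obtained by multiplying one such plateau per coordinate.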


6.4 D: Principle of the Vanishing Integrals 


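1 Proposition. Let $f \in C^\infty([\alpha, \beta])$ and suppose

$$ \int_\alpha^\beta f(t) z(t)\, dt = 0 \tag{D1} $$

for all $z \in C_0^\infty([\alpha, \beta])$. Then $f \equiv 0$.


PROOF. If $f \not\equiv 0$, there is $t_0 \in (\alpha, \beta)$ where $f(t_0) \ne 0$. Let $[\bar\alpha, \bar\beta] \subset [\alpha, \beta]$
be an interval around $t_0$ such that $|f(t)| > \tfrac12 |f(t_0)|$, $\forall t \in [\bar\alpha, \bar\beta]$. Let $t \to
\chi(t)$, $t \in [\alpha, \beta]$ be a non-negative $C^\infty$ function positive in $t_0$ and vanishing outside $[\bar\alpha, \bar\beta]$.
Then $t \to f(t_0)\chi(t)$ is in $C_0^\infty([\alpha, \beta])$ and

$$ 0 = \int_\alpha^\beta f(t_0)\chi(t) f(t)\, dt = \int_{\bar\alpha}^{\bar\beta} f(t_0)\chi(t) f(t)\, dt > 0. \tag{D2} $$

mbe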
6.5 E: Matrix Notations. Eigenvalues and Eigenvectors.
A List of some Basic Results in Algebra 


The reader who wishes more details (or proofs) on the subjects discussed 
below may consult [14] Chaps. 1 and 2. 


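1. Given an $\ell \times m$ matrix $J$, and an $m \times p$ matrix $L$, $JL$ denotes the $\ell \times p$ matrix
obtained by multiplying “rows by columns” the matrices $J$ and $L$.


2. If $J$ is an $\ell \times m$ matrix and $x \in \mathbb{C}^m$, we denote $y = Jx$ the vector of $\mathbb{C}^\ell$
with components

$$ y_i = \sum_{k=1}^m J_{ik} x_k, \qquad i = 1, \ldots, \ell \tag{E1} $$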
k=1 


3. The determinant, det J, of a matrix J is defined for all the square ma- 
trices. If J and L are two d x d square matrices, det JL = det J det L. The 
determinant is a linear combination of products of matrix elements. 


4. The sum of two $\ell \times m$ matrices is an $\ell \times m$ matrix with matrix elements
given by the sums of the homonymous matrix elements of the two matrices.
The matrix $\lambda J$, $\lambda \in \mathbb{C}$, is the matrix whose elements are those of $J$ multiplied
by $\lambda$. The modulus of an $\ell \times m$ matrix $J$ is

$$ |J| = \sum_{i=1}^{\ell} \sum_{j=1}^{m} |J_{ij}|. \tag{E2} $$

If $J$ is an $\ell \times m$ matrix and $L$ is an $m \times p$ matrix,

$$ |JL| \le |J|\,|L|. \tag{E3} $$

In §5.12 (only) we use the symbol $\|J\|$ for the right-hand side of (E2) and
$|J|$ for $\max |J_{ij}|$; then Eq. (E3) is changed by an extra factor $\ell m$ in the right
hand side.


5. The $d \times d$ identity matrix will usually be simply denoted by 1 and similarly,
the product of $\lambda \in \mathbb{C}$ with the identity matrix will be denoted $\lambda$.


6. The eigenvalues of a square matrix are the solutions of the algebraic equation
in $\lambda$ (“secular or characteristic equation”):

$$ \det(J - \lambda) = 0. \tag{E4} $$

7. The inverse matrix to a square matrix $J$ exists if and only if $\det J \ne 0$ and
it will be denoted $J^{-1}$: it is characterized by the property $JJ^{-1} = J^{-1}J = 1$.
Its matrix elements are expressible as ratios of determinants of submatrices
of $J$ by the determinant of $J$.
8. If $f(z) = \sum_{n=0}^\infty c_n z^n$ is a power series with radius of convergence $\rho > 0$
and if $J$ is a square matrix such that $|J| < \rho$ and if $J^0 \equiv 1$, the series

$$ f_{ij} = \sum_{n=0}^\infty c_n (J^n)_{ij} \tag{E5} $$

are absolutely convergent since

$$ \sum_{n=0}^\infty |c_n (J^n)_{ij}| \le \sum_{n=0}^\infty |c_n|\,|J^n| \le \sum_{n=0}^\infty |c_n|\,|J|^n < \infty \tag{E6} $$

They define a matrix that will be denoted $f(J)$.
If $P(z), Q(z)$ are two polynomials and $PQ(z)$ is their product polynomial,
it is

$$ P(J)Q(J) = PQ(J) \tag{E7} $$

(if one thinks of the definition of the product of polynomials and of the fact
that the product of matrices is distributive).

Similarly, if $f(z), g(z)$ are two power series with radius of convergence $\rho$, their
product power series $fg(z)$ has the same radius of convergence and the above
relation is generalized by

$$ fg(J) = f(J)g(J). \tag{E8} $$

In particular, if $|J| < 1$, $f(z) = 1 - z$, $g(z) = (1 - z)^{-1} = \sum_{n=0}^\infty z^n$, $fg(z) \equiv 1$,
so that $g(J)$ is the inverse to $(1 - J)$; i.e.,

$$ (1 - J)^{-1} = \sum_{n=0}^\infty J^n \tag{E9} $$

and

$$ |(1 - J)^{-1} - 1| \le \sum_{n=1}^\infty |J|^n = \frac{|J|}{1 - |J|}. \tag{E10} $$

9. A real square matrix $J$ is said to be “orthogonal” if $J^{-1} = J^T$, where
$(J^T)_{ij} = J_{ji}$, $\forall i, j$. The orthogonal $d \times d$ matrices can also be thought of as
“rotations of $\mathbb{R}^d$”. The rotation of $\mathbb{R}^d$ corresponding to the orthogonal matrix
$J$ will be the map of $\mathbb{R}^d$ into itself:

$$ x \to Jx \tag{E11} $$

10. If $J$ is a $d \times d$ matrix and if $y, x \in \mathbb{C}^d$,

$$ x \cdot Jy = J^T x \cdot y. \tag{E12} $$
11. The eigenvalues of a matrix enjoy remarkable properties. For instance:


1 Proposition. If $J$ is a $d \times d$ matrix with pairwise-distinct eigenvalues
$\lambda_1, \ldots, \lambda_d$, there are $d$ vectors $v^{(1)}, \ldots, v^{(d)} \in \mathbb{C}^d$, generally complex even if $J$
is a real matrix, such that

$$ J v^{(i)} = \lambda_i v^{(i)}, \qquad i = 1, \ldots, d \tag{E13} $$

and they are linearly independent.

If $J$ is a real matrix, the eigenvalues and the eigenvectors can be arranged so
that they appear in complex-conjugate pairs.

If the matrix $J$ varies in the neighborhood (in the sense that $|J - J_0|$ is small)
of a matrix $J_0$ with pairwise-distinct eigenvalues, then the eigenvalues and
the corresponding eigenvectors can be chosen and labeled so that they vary
smoothly with $J$, i.e., so that the eigenvalue $\lambda_j$ of $J$ and the corresponding
eigenvector components $(v^{(j)})_k$, $j, k = 1, \ldots, d$, are $C^\infty$ functions of the matrix
elements of $J$.


6.6 F: Positive-Definite Matrices. Eigenvalues and 
Eigenvectors. A List of Basic Properties 


The reader who wishes more details (or proofs) on the subjects discussed
below may consult [14], Chaps. 1 and 2.


Definition. A real matrix $V = (V_{ij})$, $i, j = 1, \dots, d$, is "positive definite" if


(i) $$V_{ij} = V_{ji}, \qquad i, j = 1, \dots, d \qquad \text{(Symmetry)} \tag{F1}$$


(ii) For all $\alpha = (\alpha_1, \dots, \alpha_d) \in \mathbf{R}^d$, $\alpha \neq 0$:


$$(V\alpha \cdot \alpha) \equiv \sum_{i,j=1}^{d} V_{ij}\,\alpha_i \alpha_j > 0 \qquad \text{(Positivity)} \tag{F2}$$


We now collect the main properties of the positive-definite matrices in two 
propositions. 

First, note that $\det V \neq 0$; otherwise, there would be $\alpha_0 \neq 0$ such that
$V\alpha_0 = 0$, contradicting (ii) above.

The following proposition states the “existence of an orthonormal basis on 
which V is diagonal”. 


1 Proposition. If $V$ is a $d \times d$ positive-definite matrix, there exist $d$ positive
numbers $\lambda_1, \dots, \lambda_d$ and an orthonormal basis $v^{(1)}, \dots, v^{(d)}$ in $\mathbf{R}^d$ such that


$$V v^{(j)} = \lambda_j v^{(j)}, \qquad j = 1, \dots, d \tag{F3}$$

and the orthogonal matrix


$$J_{ij} = (v^{(i)})_j, \qquad i, j = 1, \dots, d \tag{F4}$$

is such that

$$J V J^T = \Lambda, \qquad V = J^T \Lambda J, \tag{F5}$$

where $\Lambda$ is the diagonal $d \times d$ matrix with diagonal elements given by $\lambda_1, \dots, \lambda_d$.


Observation. Eq. (F5) implies that $\lambda_1, \dots, \lambda_d$ are the eigenvalues of $V$ counted
according to multiplicity. In fact,


$$\det(V - \lambda) = \det(J^T \Lambda J - \lambda) = \det(J^T (\Lambda - \lambda) J) = \det(\Lambda - \lambda) = \prod_{i=1}^{d} (\lambda_i - \lambda) \tag{F6}$$

(since $J^T J = 1$, $\det J \det J^T = \det J J^T = 1$).
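2 Corollary. If $V$ is a positive-definite $d \times d$ matrix, there is a positive-definite
matrix $\sqrt{V}$ such that $(\sqrt{V})^2 = V$. More generally, if $a \in \mathbf{R}$, there is a positive-
definite matrix $V^a$ such that, $\forall\, a, b \in \mathbf{R}$, $V^a V^b = V^{a+b}$ and $V^1 = V$, $V^0 = 1$.


PROOF. If $\Lambda$ is a diagonal $d \times d$ matrix such that Eq. (F5) holds, we set
$\Lambda^a$ = {diagonal matrix with diagonal elements $\lambda_1^a, \dots, \lambda_d^a$}. Then $\Lambda^a \Lambda^b = \Lambda^{a+b}$,
$\forall\, a, b \in \mathbf{R}$, $\Lambda^1 = \Lambda$, $\Lambda^0 = 1$; so we set


$$V^a = J^T \Lambda^a J \tag{F7}$$

and $V^a$ verifies the desired properties. $V^{1/2} \equiv \sqrt{V}$ by definition. mbe

3 Corollary. If $V$ is a $d \times d$ positive-definite matrix, there exists a continuous
function $\mu(V) > 0$ depending on the matrix elements of $V$ such that

$$V\alpha \cdot \alpha \ge \mu(V)\, \alpha \cdot \alpha. \tag{F8}$$

In fact, $\mu(V) = \min_i \lambda_i$, if $\lambda_i$ are the eigenvalues of $V$.


PROOF.

$$V\alpha \cdot \alpha = J^T \Lambda J \alpha \cdot \alpha = \Lambda J\alpha \cdot J\alpha = \sum_{i=1}^{d} \lambda_i \big((J\alpha)_i\big)^2 \ge \mu(V) \sum_{i=1}^{d} \big((J\alpha)_i\big)^2 = \mu(V)\, J\alpha \cdot J\alpha = \mu(V)\, J^T J \alpha \cdot \alpha = \mu(V)\, \alpha \cdot \alpha. \tag{F9}$$

mbe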


We conclude with a generalization of the above results.

4 Proposition. Let $G$, $V$ be two positive-definite $d \times d$ matrices. There are
$d$ independent vectors $v^{(1)}, \dots, v^{(d)} \in \mathbf{R}^d$ and $d$ positive numbers $\lambda_1, \dots, \lambda_d$
such that

$$V v^{(j)} = \lambda_j G v^{(j)}, \qquad j = 1, \dots, d, \tag{F10}$$

$$G v^{(i)} \cdot v^{(j)} = \delta_{ij}, \qquad i, j = 1, \dots, d. \tag{F11}$$

The numbers $\lambda_1, \dots, \lambda_d$ are the solutions, repeated with multiplicity, of

$$\det(V - \lambda G) = 0. \tag{F12}$$


There is a function $\mu(V, G) > 0$ continuously dependent on the matrix elements
of $V$, $G$ such that


$$V\alpha \cdot \alpha \ge \mu(V, G)\, (G\alpha \cdot \alpha). \tag{F13}$$


Observation. This is reduced to the preceding propositions. If $w^{(1)}, \dots, w^{(d)}$
are the eigenvectors of the positive-definite matrix $W = G^{-1/2} V G^{-1/2}$, the $v^{(j)}$
are


$$v^{(j)} = G^{-1/2} w^{(j)}, \qquad j = 1, \dots, d. \tag{F14}$$
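
For $d = 2$ the statements of Proposition 4 can be verified by hand or with a few lines of code. The sketch below is only an illustration under stated assumptions: the function names (`quad`, `generalized_eigen_2x2`) and the sample matrices are hypothetical, the two roots of Eq. (F12) are assumed distinct, and the value $\mu(V, G) = \min_j \lambda_j$ is used for the lower bound (it follows by expanding $\alpha$ on the vectors $v^{(j)}$).

```python
import math

def quad(M, u, w):
    """Bilinear form (M u) . w for a 2x2 matrix M."""
    Mu = (M[0][0] * u[0] + M[0][1] * u[1], M[1][0] * u[0] + M[1][1] * u[1])
    return Mu[0] * w[0] + Mu[1] * w[1]

def generalized_eigen_2x2(V, G):
    """Solve V v = lam G v for symmetric positive-definite 2x2 matrices.

    Assumes the two roots of det(V - lam G) = 0 are distinct.
    """
    # det(V - lam G) = a lam^2 + b lam + c
    a = G[0][0] * G[1][1] - G[0][1] ** 2
    b = -(V[0][0] * G[1][1] + V[1][1] * G[0][0] - 2.0 * V[0][1] * G[0][1])
    c = V[0][0] * V[1][1] - V[0][1] ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    lams = [(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)]
    vecs = []
    for lam in lams:
        m = [[V[i][j] - lam * G[i][j] for j in range(2)] for i in range(2)]
        # Two candidate null vectors of the singular matrix m; keep the larger.
        c1 = (-m[0][1], m[0][0])
        c2 = (m[1][1], -m[0][1])
        v = c1 if abs(c1[0]) + abs(c1[1]) >= abs(c2[0]) + abs(c2[1]) else c2
        s = math.sqrt(quad(G, v, v))       # normalize so that G v . v = 1
        vecs.append((v[0] / s, v[1] / s))
    return lams, vecs

V = [[2.0, 1.0], [1.0, 3.0]]
G = [[2.0, 1.0], [1.0, 1.0]]
lams, vecs = generalized_eigen_2x2(V, G)
mu = min(lams)
```

For these sample matrices the characteristic equation is $\lambda^2 - 6\lambda + 5 = 0$, so the generalized eigenvalues are 1 and 5, and the computed vectors are orthonormal with respect to $G$.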


6.7 G: Implicit Functions Theorems 


Let $f \in C^\infty(\mathbf{R}^m \times \mathbf{R}^d)$ be a function with values in $\mathbf{R}^d$ associating to
$(x, y) \in \mathbf{R}^m \times \mathbf{R}^d$ the value $f(x, y)$.
Consider the equation for $y \in \mathbf{R}^d$ parameterized by $x$:


$$f(x, y) = 0 \tag{G1}$$


which is a system of $d$ equations in the $d$ unknowns $y_1, \dots, y_d$.
Suppose that $(x_0, y_0) \in \mathbf{R}^m \times \mathbf{R}^d$ verifies Eq. (G1). By the Taylor theorem,
see Appendix B,


$$f(x, y) = J\,(y - y_0) + L\,(x - x_0) + N(x, y) \tag{G2}$$

where $J$, $L$ are the $d \times d$ and $d \times m$ matrices built with the derivatives of $f$:


$$J_{ij} = \frac{\partial f_i}{\partial y_j}(x_0, y_0), \qquad i, j = 1, \dots, d \tag{G3}$$


$$L_{ij} = \frac{\partial f_i}{\partial x_j}(x_0, y_0), \qquad i = 1, \dots, d, \quad j = 1, \dots, m \tag{G4}$$

and $N$ is an $\mathbf{R}^d$-valued $C^\infty$-function with a second-order zero in $(x_0, y_0)$, see
Appendix B, Proposition 3 with $m = 1$, $k = +\infty$.

The implicit function theorem compares the solution of Eq. (G1), written
as


$$J\,(y - y_0) + L\,(x - x_0) + N(x, y) = 0, \tag{G5}$$


with that of the linear equations ($d \times d$ linear system)


$$J\,(y - y_0) + L\,(x - x_0) = 0. \tag{G6}$$

If $\det J \neq 0$, the matrix $J^{-1}$ exists and Eq. (G6) has the unique solution


$$y - y_0 = -J^{-1} L\,(x - x_0). \tag{G7}$$


Therefore, it becomes natural to think that Eq. (G5) admits a solution
differing from Eq. (G7) "by higher-order infinitesimals in $x - x_0$", since such
is the difference between Eq. (G5) and Eq. (G6). More precisely, one can hope
that there exists in a vicinity $U$ of $x_0$ a function $\varphi(x)$ such that


$$f(x, \varphi(x)) = 0, \qquad x \in U, \tag{G8}$$

$$\varphi(x) = y_0 - J^{-1} L\,(x - x_0) + \Phi(x), \qquad x \in U, \tag{G9}$$


where $\Phi \in C^\infty(U)$ and has a second-order zero at $x_0$.

This is, in fact, the content of the implicit function theorems. Since we
shall also need explicit estimates of the size of the set $U$ on which $\varphi$ can
be defined, and of the size of $\varphi(U)$, its $\varphi$-image, it is more appropriate to
describe the proof in notations which are convenient for us rather than to
refer to a standard book.

We first treat the $d = 1$ case, denoting $\Gamma_m(x, \rho) \subset \mathbf{R}^m$ the closed cube with
center $x \in \mathbf{R}^m$ and side $2\rho$.


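1 Proposition. Given $\delta > 0$, $a > 0$, $a > \delta$, define


$$\rho_{\delta,a} \stackrel{\rm def}{=} \frac{\delta\, \min \big|\frac{\partial f}{\partial y}\big|}{2 \max \Big( \sum_{j=1}^{m} \big|\frac{\partial f}{\partial x_j}\big| + \big|\frac{\partial f}{\partial y}\big| \Big)} \tag{G10}$$


where the minimum and the maximum are considered as $x$ varies in $\Gamma_m(x_0, a)$
and as $y - y_0$ varies in $[-\delta, \delta]$, and we suppose that $f(x_0, y_0) = 0$.

If $\rho_{\delta,a} > 0$ it is possible to define a function $\varphi$ in $C^\infty(\Gamma_m(x_0, \rho_{\delta,a}))$ verifying
Eq. (G8) for every $x \in \Gamma_m(x_0, \rho_{\delta,a})$.

Furthermore, all the solutions of Eq. (G1) in $\Gamma_m(x_0, \rho_{\delta,a}) \times [y_0 - \delta, y_0 + \delta]$
have the form $(x, \varphi(x))$ and


$$\frac{\partial \varphi}{\partial x_j}(x) = - \frac{\frac{\partial f}{\partial x_j}(x, \varphi(x))}{\frac{\partial f}{\partial y}(x, \varphi(x))}. \tag{G11}$$

PROOF. Let $(x, y_0 + \delta)$ be a point on the upper face of the parallelepiped
$\Gamma_m(x_0, \rho_{\delta,a}) \times [y_0 - \delta, y_0 + \delta]$. We show that on this face $f$ has a well-defined
sign, opposite to the one it has on the lower face.

Since $\frac{\partial f}{\partial y}$ cannot vanish in the parallelepiped, by the choice of $\rho_{\delta,a}$ and
because $\rho_{\delta,a} > 0$, this will imply that for each $x \in \Gamma_m(x_0, \rho_{\delta,a})$ there is one and
only one point $\varphi(x) \in [y_0 - \delta, y_0 + \delta]$ such that $f(x, \varphi(x)) = 0$ (note that
$\frac{\partial f}{\partial y} \neq 0$ implies strict monotonicity in $y$).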

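To show that $f$ takes opposite signs on the opposite faces suppose, to be
definite, $\frac{\partial f}{\partial y} > 0$ in $\Gamma_m(x_0, \rho_{\delta,a}) \times [y_0 - \delta, y_0 + \delta]$. Then


$$f(x, y_0 + \delta) = f(x, y_0 + \delta) - f(x_0, y_0) = \big(f(x, y_0 + \delta) - f(x, y_0)\big) + \big(f(x, y_0) - f(x_0, y_0)\big) \tag{G12}$$


and we apply the Lagrange theorem to find $\bar x$ and $\bar y$, intermediate between $x$
and $x_0$ and between $y_0$ and $y_0 + \delta$, such that the right-hand side of Eq. (G12)
can be written


$$f(x, y_0 + \delta) = \frac{\partial f}{\partial y}(x, \bar y)\, \delta + \sum_{i=1}^{m} \frac{\partial f}{\partial x_i}(\bar x, y_0)\,(x - x_0)_i \ge \Big(\min \Big|\frac{\partial f}{\partial y}\Big|\Big)\, \delta - \Big(\max \sum_{i=1}^{m} \Big|\frac{\partial f}{\partial x_i}\Big|\Big)\, \rho_{\delta,a} > 0. \tag{G13}$$


Similarly, one proves that $f(x, y_0 - \delta) < 0$, $\forall\, x \in \Gamma_m(x_0, \rho_{\delta,a})$. This proves
the existence of $\varphi(x)$ and its uniqueness.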

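To show the differentiability in the direction of the axis $e = (e_1, \dots, e_m)$
observe that, given $x \in \Gamma_m(x_0, \rho_{\delta,a})$ and given $\varepsilon$ such that $x + \varepsilon e \in \Gamma_m(x_0, \rho_{\delta,a})$,
if $\bar x$, $\bar y$ are suitable intermediate points between $x$ and $x + \varepsilon e$ and between $\varphi(x)$ and
$\varphi(x + \varepsilon e)$, one finds


$$0 = f(x + \varepsilon e, \varphi(x + \varepsilon e)) - f(x, \varphi(x)), \tag{G14}$$

$$0 = \sum_{i=1}^{m} \frac{\partial f}{\partial x_i}(\bar x, \bar y)\, \varepsilon e_i + \frac{\partial f}{\partial y}(\bar x, \bar y)\, \big(\varphi(x + \varepsilon e) - \varphi(x)\big) \tag{G15}$$

by the Lagrange theorem, and this shows that

$$|\varphi(x + \varepsilon e) - \varphi(x)| \le \frac{\max \sum_{i} \big|\frac{\partial f}{\partial x_i}\big|}{\min \big|\frac{\partial f}{\partial y}\big|}\, |\varepsilon|, \tag{G16}$$


i.e., $\varphi$ is continuous. Eq. (G15) also yields, dividing it by $\varepsilon$ and letting $\varepsilon \to 0$,


$$\lim_{\varepsilon \to 0} \frac{\varphi(x + \varepsilon e) - \varphi(x)}{\varepsilon} = - \frac{\sum_{i=1}^{m} \frac{\partial f}{\partial x_i}(x, \varphi(x))\, e_i}{\frac{\partial f}{\partial y}(x, \varphi(x))}, \tag{G17}$$


proving the differentiability of $\varphi$ and Eq. (G11).
By the chain-differentiation rule for composed functions, Eq. (G11) implies
that the $\frac{\partial \varphi}{\partial x_j}$ are differentiable in $x$ and their derivatives can be expressed in
terms of $\varphi$, of its first derivatives and of $f$ and its first two partial derivatives.
Therefore, the $\frac{\partial^2 \varphi}{\partial x_i \partial x_j}$ are differentiable, etc., i.e., $\varphi \in C^\infty(\Gamma_m(x_0, \rho_{\delta,a}))$. mbe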
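
The proof is constructive: the sign change between the lower and upper faces allows $\varphi(x)$ to be located by bisection in $y$. The sketch below illustrates this on a hypothetical example, $f(x, y) = y^3 + y - x$ with $(x_0, y_0) = (0, 0)$ and $\delta = 1$; it assumes Eq. (G10) reads $\rho_{\delta,a} = \delta \min|\partial f/\partial y| \,/\, \big(2 \max(\sum_j |\partial f/\partial x_j| + |\partial f/\partial y|)\big)$, and it compares a finite-difference slope of $\varphi$ with the ratio of partial derivatives in Eq. (G11).

```python
def f(x, y):
    """Hypothetical example with f(0, 0) = 0 and df/dy > 0 everywhere."""
    return y ** 3 + y - x

def df_dx(x, y):
    return -1.0

def df_dy(x, y):
    return 3.0 * y ** 2 + 1.0

def phi(x, y0=0.0, delta=1.0, iterations=60):
    """Unique zero of f(x, .) in [y0 - delta, y0 + delta], found by bisection."""
    lo, hi = y0 - delta, y0 + delta
    # f(x, .) is strictly increasing, negative at lo and positive at hi.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(x, mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

delta = 1.0
# On |y| <= 1: min |df/dy| = 1 and max(|df/dx| + |df/dy|) = 1 + 4.
rho = delta * 1.0 / (2.0 * (1.0 + 4.0))

x = 0.07                                   # a point with |x| <= rho
y = phi(x)
h = 1e-5
numerical_slope = (phi(x + h) - phi(x - h)) / (2.0 * h)
formula_slope = -df_dx(x, y) / df_dy(x, y)  # ratio of partial derivatives
```

Bisection is chosen here only because it mirrors the monotonicity argument of the proof; any one-dimensional root finder would serve equally well.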


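Observation. It appears from the proof that the same results hold if no relation
is assumed a priori between $a$ and $\delta$ provided $\rho_{\delta,a}$ is replaced by


$$\rho_{\delta,a} = \min \Big\{ a,\ \frac{\delta \min \big|\frac{\partial f}{\partial y}\big|}{2 \max \sum_{j} \big|\frac{\partial f}{\partial x_j}\big|} \Big\} \tag{G18}$$


which is a better result; see Eq. (G13).
To deal with the general case, introduce, given a matrix $M$,


$$|M| = \sum_{i,j} |M_{ij}| \tag{G19}$$

and note that $|M \cdot N| \le |M|\,|N|$ if $M \cdot N$ makes sense, i.e., if the number of
columns of $M$ equals that of the rows of $N$. Also define the matrices


$$J(x, y) \stackrel{\rm def}{=} \frac{\partial f}{\partial y}(x, y), \qquad L(x, y) \stackrel{\rm def}{=} \frac{\partial f}{\partial x}(x, y). \tag{G20}$$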
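2 Proposition. Given $\delta, a > 0$, define

$$\rho_{\delta,a} \stackrel{\rm def}{=} \frac{\delta - 2\,(\max |J^{-1}|)\,\big(a \max \big|\frac{\partial N}{\partial x}\big| + \delta \max \big|\frac{\partial N}{\partial y}\big|\big)}{2 \max |J^{-1} L|} \tag{G21}$$

with the maxima taken on $\Gamma_m(x_0, a) \times \Gamma_d(y_0, \delta)$, and set $\rho_{\delta,a} = 0$ if $J^{-1}$ does
not exist at some point of this set. Suppose $f(x_0, y_0) = 0$, and $a > \rho_{\delta,a} > 0$.
It is then possible to find $\varphi \in C^\infty(\Gamma_m(x_0, \rho_{\delta,a}))$ with values in $\Gamma_d(y_0, \delta)$
verifying Eq. (G8).

Furthermore, all the solutions of Eq. (G1) in $\Gamma_m(x_0, \rho_{\delta,a}) \times \Gamma_d(y_0, \delta)$ have the
form $(x, \varphi(x))$ and


$$\frac{\partial \varphi_k}{\partial x_j}(x) = - \sum_{i=1}^{d} \Big(\frac{\partial f}{\partial y}(x, \varphi(x))\Big)^{-1}_{ki}\, \frac{\partial f_i}{\partial x_j}(x, \varphi(x)), \qquad x \in \Gamma_m(x_0, \rho_{\delta,a}). \tag{G22}$$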
Observations.


(1) Note that the above proposition is nonempty. Using the fact that $N$ has a
second-order zero at $(x_0, y_0)$, given $B > |J(x_0, y_0)^{-1} L(x_0, y_0)|^{-1}$, we see
that for $\delta$ small enough (depending on $B$) and $a = B\delta$ it is:

$$0 < \rho_{\delta,a} \le \frac{\delta}{2 \max |J^{-1} L|} < B\delta = a. \tag{G23}$$

(2) There are two methods to prove a theorem like the above. The most
natural would be to deduce it as a corollary of Proposition 1. One would just
proceed by substitution as in the solution of linear systems.

The assumption $\det J \neq 0$ implies that there is at least one derivative
$\frac{\partial f^{(\bar j)}}{\partial y_1} \neq 0$. Then we apply Proposition 1 to the function $f = f^{(\bar j)}$ with $y = y_1$
and $x$ replaced by $(x, y_2, \dots, y_d)$ and call $\varphi_1(x, y_2, \dots, y_d)$ its solution defined
close enough to $x_0, y_{02}, y_{03}, \dots, y_{0d}$. Then, supposing $\bar j = 1$, consider


$$\begin{aligned} f^{(2)}(x, \varphi_1(x, y_2, \dots, y_d), y_2, \dots, y_d) &= 0 \\ &\ \,\vdots \\ f^{(d)}(x, \varphi_1(x, y_2, \dots, y_d), y_2, \dots, y_d) &= 0 \end{aligned} \tag{G24}$$


The determinant of the Jacobian matrix $J_1$ of the left-hand side of Eq. (G24)
with respect to $y_2, \dots, y_d$ cannot vanish in $x_0, y_{02}, y_{03}, \dots, y_{0d}$ because it can
be shown to coincide with the determinant of the linear system of equations
obtained from the system $J(x_0, y_0)\,\xi = \eta$ by solving its first equation with
respect to $\xi_1$ and substituting into the others. Therefore, we can again apply
Proposition 1, expressing, say, $y_2$ as a function of $x, y_3, \dots, y_d$ close enough
to $x_0, y_{03}, \dots, y_{0d}$, etc. The only difficulty is that the left-hand side of Eq.
(G24) is only defined, and $C^\infty$, in a small vicinity of $x_0, (y_0)_2, \dots, (y_0)_d$,
and not on all of $\mathbf{R}^m \times \mathbf{R}^d$, as would be required by Proposition 1. This is,
however, an obviously trivial difficulty. What is more difficult in this method
is to keep track of the size of the neighborhoods involved, in order to obtain
an explicit formula like Eq. (G21). Therefore, here we shall adopt another
classical method of proof. The triumph of the naive substitution method will
appear in Appendix N where, however, additional assumptions on $f$ are made.


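PROOF. Write Eq. (G1) as Eq. (G5) and let


$$y' - y_0 = -J^{-1} L\,(x - x_0) - J^{-1} N(x, y) \tag{G25}$$

for $(x, y) \in \Gamma_m(x_0, a) \times \Gamma_d(y_0, \delta)$. Note that $|y' - y_0| < \delta$ if $(x, y) \in
\Gamma_m(x_0, \rho_{\delta,a}) \times \Gamma_d(y_0, \delta)$ and if (as supposed) $a > \rho_{\delta,a} > 0$. In fact, by the
Lagrange theorem and $N(x_0, y_0) = 0$, it follows that


$$|y' - y_0| \le |J^{-1} L|\, \rho_{\delta,a} + |J^{-1} (N(x, y) - N(x_0, y_0))| \le |J^{-1} L|\, \rho_{\delta,a} + \max |J^{-1}|\, \Big( a \max \Big|\frac{\partial N}{\partial x}\Big| + \delta \max \Big|\frac{\partial N}{\partial y}\Big| \Big) < \delta. \tag{G26}$$


Therefore, at $x$ fixed in $\Gamma_m(x_0, \rho_{\delta,a})$, Eq. (G25) yields a map of $\Gamma_d(y_0, \delta)$
into itself. We can, therefore, recursively define, for each fixed $x \in \Gamma_m(x_0, \rho_{\delta,a})$,


$$y_n - y_0 = -J^{-1} L\,(x - x_0) - J^{-1} N(x, y_{n-1}), \tag{G27}$$

$n = 1, 2, \dots$. Then,

$$|y_n - y_{n-1}| = |J^{-1} (N(x, y_{n-1}) - N(x, y_{n-2}))| \le (\max |J^{-1}|)\, \Big(\max \Big|\frac{\partial N}{\partial y}\Big|\Big)\, |y_{n-1} - y_{n-2}| \le \frac{1}{2}\, |y_{n-1} - y_{n-2}|, \tag{G28}$$


having used in the last step the hypothesis $\rho_{\delta,a} > 0$ which implies that
$(\max |J^{-1}|)\,(\max |\frac{\partial N}{\partial y}|) < \frac{1}{2}$. Therefore, $|y_n - y_{n-1}| \le 2^{-(n-1)} |y_1 - y_0|$ and
there exists the limit


$$\varphi(x) = \lim_{n \to \infty} y_n = y_0 + \sum_{k=1}^{\infty} (y_k - y_{k-1}). \tag{G29}$$

If $(x, \bar y)$ is another solution to Eq. (G1) in $\Gamma_m(x_0, \rho_{\delta,a}) \times \Gamma_d(y_0, \delta)$, we can
write Eq. (G1) in the form of Eq. (G5) for $\bar y$ and $\varphi(x)$ and subtract:


$$|\bar y - \varphi(x)| = |J^{-1} (N(x, \bar y) - N(x, \varphi(x)))| \le \frac{1}{2}\, |\bar y - \varphi(x)|, \tag{G30}$$


i.e. $\bar y = \varphi(x)$, proving uniqueness. The differentiability statement is proved
as in Proposition 1. mbe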
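
The recursion used in this proof is a contraction iteration and can be run directly. The sketch below applies it to a hypothetical example with $m = 1$, $d = 2$, $(x_0, y_0) = (0, (0, 0))$; the function `f`, the helper names, and the sample point are assumptions made for illustration, with $J$ and $L$ the derivatives of `f` at the base point and $N$ the remainder of the Taylor formula (G2).

```python
def f(x, y):
    """Hypothetical map R x R^2 -> R^2 with f(0, (0, 0)) = 0."""
    return (y[0] + 0.1 * y[1] ** 2 - x,
            y[1] + 0.1 * y[0] ** 2 - 0.5 * x)

X0, Y0 = 0.0, (0.0, 0.0)
J = ((1.0, 0.0), (0.0, 1.0))     # df/dy at (X0, Y0)
L = (-1.0, -0.5)                 # df/dx at (X0, Y0)

def inv2(M):
    """Inverse of a 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return ((M[1][1] / det, -M[0][1] / det), (-M[1][0] / det, M[0][0] / det))

def apply(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def remainder(x, y):
    """N(x, y) = f(x, y) - J (y - y0) - L (x - x0)."""
    fx = f(x, y)
    Jy = apply(J, (y[0] - Y0[0], y[1] - Y0[1]))
    return tuple(fx[i] - Jy[i] - L[i] * (x - X0) for i in range(2))

def implicit_function(x, steps=50):
    """Iterate y_n = y0 - J^-1 (L (x - x0) + N(x, y_{n-1})), starting at y0.

    Returns the last iterate and the sizes |y_n - y_{n-1}| of the increments.
    """
    Jinv = inv2(J)
    y, increments = Y0, []
    for _ in range(steps):
        N = remainder(x, y)
        rhs = apply(Jinv, tuple(L[i] * (x - X0) + N[i] for i in range(2)))
        y_new = (Y0[0] - rhs[0], Y0[1] - rhs[1])
        increments.append(abs(y_new[0] - y[0]) + abs(y_new[1] - y[1]))
        y = y_new
    return y, increments

x = 0.2
y, increments = implicit_function(x)
residual = f(x, y)
```

In this example the first increments shrink much faster than the factor 1/2 guaranteed by the proof, because the remainder $N$ is quadratically small near the base point.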


3 Corollary. Under the assumptions of Proposition 2, let $m = d$ and, see Eq.
(G23), give $B, C > 1$ such that


$$B > (\min |J^{-1} L|)^{-1}, \qquad C > (\min |L^{-1} J|)^{-1}, \tag{G31}$$


where the minima are taken over $\Gamma_d(x_0, \bar a) \times \Gamma_d(y_0, \bar\delta)$ with given $\bar a, \bar\delta > 0$.
Suppose that $\delta > 0$ is so small that $\delta, B\delta < \bar a, \bar\delta$.
Define 0a 5 as in Eq. (G21) and Ùa s 


E ƏN ƏN 
z a R (a max |S>| + l) (G32) 
ep max |L~1J| i 


where the maxima are now considered on Ta(xo,@) x Talyo, ô) both for 05,a 
and Osa . Then if 6 is so small that 
0< 0! osBs<Bô and 0< 55a 


Sar re ô (G33) 
(which is possible by Observation (1), p.530), the p-image of I'a(xo, @) covers 


Talyo, @). 


Observations.

(1) This means that if the Jacobians of $f$ with respect to $x$ and with respect to
$y$ have nonvanishing determinant at $(x_0, y_0)$, then $f$ sets up a correspondence
between $x, y$ near $x_0, y_0$ of class $C^\infty$, with inverse of class $C^\infty$, and sending
open sets onto open sets (it is a local "$C^\infty$ diffeomorphism").

(2) Since Corollary 3 is quantitative, it says much more: it gives, in fact,
estimates of the size of the regions where $f$ can be inverted.

PROOF. Just apply Proposition 2 twice, to express $y$ in terms of $x$ and vice
versa (make a two-dimensional drawing to better understand the situation).
mbe


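Another important application of Proposition 2 is the following corollary
used in §5.10.


4 Corollary. Let $f \in C^\infty(\mathbf{T}^\ell)$ with values in $\mathbf{R}^\ell$. Consider the equation for
$\varphi \in \mathbf{T}^\ell$:


$$\varphi' = \varphi + \varepsilon f(\varphi) \tag{G34}$$


with $\varepsilon \in \mathbf{R}$ and suppose $\max |f(\varphi)| < 1$, $\max |\frac{\partial f}{\partial \varphi}(\varphi)| < 1$.
There is $\varepsilon_\ell > 0$, depending only on $\ell$ and not on $f$, such that, $\forall\, \varepsilon < \varepsilon_\ell$, the
above equation can be solved uniquely in the form


$$\varphi = \varphi' + \varepsilon g(\varphi', \varepsilon) \tag{G35}$$


with $g \in C^\infty(\mathbf{T}^\ell)$ at fixed $\varepsilon$. Furthermore $\max_\varphi |g(\varphi, \varepsilon)| < 1$, and if $\varphi$ verifies
Eq. (G34), then it is given by Eq. (G35) up to $2\pi\nu$, $\nu \in \mathbf{Z}^\ell$.


Observation. This is a "global theorem" involving an inversion on a large set,
namely, $\mathbf{T}^\ell$. It can be improved to cover the case when $f$ depends parametrically
on some $A \in \mathbf{R}^p$ so that $(A, \varphi) \to f(A, \varphi)$ is a $C^\infty$ function on $\mathbf{R}^p \times \mathbf{T}^\ell$.
Then if $f$ verifies the assumptions of the corollary for each $A \in V \subset \mathbf{R}^p$, one
can check that $g \in C^\infty(V \times \mathbf{T}^\ell)$, $\forall\, \varepsilon < \varepsilon_\ell$.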


PROOF. Let 0 < e < 4. The Jacobian matrices L, J of Eq. (G34) regarded 
as an implicit equation F(y, y’) = 0 in RE x R! near the solution (po, go + 
ef(po)), with po given in T°, are 


Of; 
Lij = 6:5, Jij = Lij +e, G36 
eg j PeT (G36) 
and by assumption [see Eqs. (E2), (E3), and (E10)] and since e < 4: 
(£ aE (£ Lee S (G37) 
4 4” 2 2” 


so that the constants B,C in Eq. (G31) can be chosen B,C > (¢— 4)~1. We 
now apply Corollary 3 to our equation near (po, po + Ef(Yo)) by choosing 
ô = y£, say, and noting that from Eqs. (G21) and (G32), it follows that for € 
small enough, 


ô 
——__ < Bo < Bå G38 
IIF = 0 < Bô, (G38) 


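Noting that $\delta \gg \varepsilon$, we see that Corollary 3 implies that as $\varphi_0$ varies on $\mathbf{T}^\ell$,
the point $\varphi_0 + \varepsilon f(\varphi_0)$ also varies covering $\mathbf{T}^\ell$ if $\varepsilon$ is small enough.
Furthermore, the map of Eq. (G34) is one to one, for $\varepsilon$ very small, as a
map of $\mathbf{T}^\ell$ onto itself. In fact, if $\varphi_1, \varphi_2 \in \mathbf{T}^\ell$ and if the segment $\sigma$ given by
$t \to \varphi_1 t + \varphi_2 (1 - t)$, $t \in [0, 1]$, is the shortest segment on $\mathbf{T}^\ell$ connecting $\varphi_1$
and $\varphi_2$, we see that the points $\varphi'_1 = \varphi_1 + \varepsilon f(\varphi_1)$ and $\varphi'_2 = \varphi_2 + \varepsilon f(\varphi_2)$ can
coincide mod $2\pi$ only if $\varphi_1 = \varphi_2$, if $\varepsilon$ is small (see footnote 1).

Since $f$ is periodic, the assumption that $\sigma$ is the shortest path on $\mathbf{T}^\ell$ leading
from $\varphi_1$ to $\varphi_2$ cannot be restrictive and, therefore, the map $\varphi \to \varphi + \varepsilon f(\varphi)$
is one to one for $\varepsilon < 1$.

So the map of Eq. (G34) can be inverted on $\mathbf{T}^\ell$ and its inverse map $\varphi' \to
F_\varepsilon(\varphi')$ is $C^\infty$ near every point if $\varepsilon$ is small enough. Clearly, Eq. (G35) holds
with $g(\varphi', \varepsilon) = -f(F_\varepsilon(\varphi'))$ which also proves $|g| < 1$. mbe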
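
The inversion of Corollary 4 can be tried numerically in the simplest case $\ell = 1$. The sketch below is an illustration under stated assumptions: the sample function $f(\varphi) = 0.9 \sin\varphi$, the value $\varepsilon = 0.1$, and the names `invert` and `g` are all hypothetical, and the inverse is computed by the fixed-point iteration $\varphi \leftarrow \varphi' - \varepsilon f(\varphi)$ rather than by the argument of the proof.

```python
import math

def f(phi):
    """Sample periodic function with max |f| < 1 and max |f'| < 1."""
    return 0.9 * math.sin(phi)

def invert(phi_prime, eps, steps=100):
    """Solve phi' = phi + eps f(phi) for phi by iterating phi <- phi' - eps f(phi)."""
    phi = phi_prime
    for _ in range(steps):
        phi = phi_prime - eps * f(phi)
    return phi

def g(phi_prime, eps):
    """The function g of the corollary, g = -f(F_eps(phi'))."""
    return -f(invert(phi_prime, eps))

eps = 0.1
samples = [2.0 * math.pi * k / 50 for k in range(50)]
# Residual of the original equation at the computed inverse.
max_residual = max(abs(invert(p, eps) + eps * f(invert(p, eps)) - p) for p in samples)
# Check that phi = phi' + eps g(phi', eps) and that |g| < 1.
max_form_error = max(abs(invert(p, eps) - (p + eps * g(p, eps))) for p in samples)
max_g = max(abs(g(p, eps)) for p in samples)
# The inverse commutes with shifts by 2 pi, as it must on the torus.
shift_error = abs(invert(1.0 + 2.0 * math.pi, eps) - invert(1.0, eps) - 2.0 * math.pi)
```

The iteration converges because $\varepsilon\, \max|f'| = 0.09 < 1$; for larger $\varepsilon$ a different solver would be needed.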


Concluding Remark 

The above proofs do not really make use of the fact that $f$ is of class $C^\infty$.
If $f$ is only supposed to be of class $C^{(k)}$, $k \ge 1$, the ideas of the proofs still
work, and the only difference will be that the inverse function $\varphi$ will not turn
out to be of class $C^\infty$, of course, but only of class $C^{(k)}$. We use the above
"$C^{(k)}$-version" of the implicit function theorems only in §5.7.


Exercise 
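In the context of Proposition 1, compute the second derivative of $\varphi(x)$ in
terms of $\varphi$ and of its first derivatives $\frac{\partial \varphi}{\partial x_j}$ and in terms of $f$ and of its first two
derivatives. (Answer:


$$\frac{\partial^2 \varphi}{\partial x_i \partial x_j} = - \frac{\frac{\partial^2 f}{\partial x_i \partial x_j} + \frac{\partial^2 f}{\partial x_j \partial y} \frac{\partial \varphi}{\partial x_i}}{\frac{\partial f}{\partial y}} + \frac{\frac{\partial f}{\partial x_j} \Big( \frac{\partial^2 f}{\partial x_i \partial y} + \frac{\partial^2 f}{\partial y^2} \frac{\partial \varphi}{\partial x_i} \Big)}{\Big(\frac{\partial f}{\partial y}\Big)^2},$$

with all the derivatives of $f$ evaluated at $(x, \varphi(x))$.)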


6.8 H: The Ascoli-Arzela Convergence Criterion 


The following elegant proposition is famous. 


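1 Proposition. Let $\Omega$ be a closed bounded set in $\mathbf{R}^d$. Let $(f_n)_{n=0}^{\infty}$ be a
sequence of continuous functions defined on $\Omega$ such that:
(i) The sequence $(f_n)_{n=0}^{\infty}$ is "equibounded", i.e., there exists $M$ such that


$$\|f_n\| = \max_{\xi \in \Omega} |f_n(\xi)| \le M; \tag{H1}$$


(ii) the sequence $(f_n)_{n=0}^{\infty}$ is "equicontinuous", i.e., given $\varepsilon > 0$ there exists
$\delta_\varepsilon > 0$ such that


$$\sup_{n,\ |\xi - \xi'| < \delta_\varepsilon} |f_n(\xi) - f_n(\xi')| < \varepsilon. \tag{H2}$$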

1 In fact, |1 — p2| cannot be too large (< 7) if ø is the shortest segment joining pı and 
p2 on Te. 
Then there is a subsequence $(f_{n_i})_{i=0}^{\infty}$ such that the limit


$$f(\xi) = \lim_{i \to \infty} f_{n_i}(\xi) \tag{H3}$$


exists, uniformly, $\forall\, \xi \in \Omega$.


Observations.

(1) Hence $f$ is continuous on $\Omega$.

(2) The most interesting aspect of this theorem is the uniformity of the
convergence.


PROOF. Let $\Omega_0 \subset \Omega$ be a denumerable dense subset of $\Omega$ (to be concrete,
think of the case when $\Omega$ is a square and $\Omega_0$ is the set of its points with
rational coordinates). We shall write $\Omega_0 = \{\xi_1, \xi_2, \dots\}$.

By the equiboundedness condition, it will be possible to find a subsequence
$(f_{n_i})_{i=0}^{\infty}$ of $(f_n)_{n=0}^{\infty}$ such that the limits


$$\lim_{i \to \infty} f_{n_i}(\xi_j) \stackrel{\rm def}{=} f(\xi_j) \tag{H4}$$

exist for all $j$. For instance, one can use the Cantor diagonal method; $f$ is defined
on $\Omega_0$ by Eq. (H4).

Without loss of generality, we may and shall assume that the subsequence
$(n_i)_{i=0}^{\infty}$ coincides with $(0, 1, 2, \dots)$, i.e., that the limits $\lim_{n \to \infty} f_n(\xi_j)$ exist
without passing to a subsequence. This will now be used to show that the
function $f$, defined on $\Omega_0$, can be extended to $\Omega$ by showing that the limit
$\lim_{n \to \infty} f_n(\xi)$ exists $\forall\, \xi \in \Omega$. In fact, we show that $(f_n(\xi))_{n=0}^{\infty}$ is a Cauchy
sequence for all $\xi \in \Omega$.

Let $\xi \in \Omega$. Given $\varepsilon > 0$ let $\bar\xi \in \Omega_0$ be such that $|\xi - \bar\xi| < \delta_\varepsilon$, see (ii);
then, by Eq. (H2):


$$|f_n(\xi) - f_m(\xi)| \le |f_n(\xi) - f_n(\bar\xi)| + |f_n(\bar\xi) - f_m(\bar\xi)| + |f_m(\bar\xi) - f_m(\xi)| \le 2\varepsilon + |f_n(\bar\xi) - f_m(\bar\xi)| \xrightarrow[n, m \to \infty]{} 2\varepsilon \tag{H5}$$

because $(f_n(\bar\xi))_{n=0}^{\infty}$ is a Cauchy sequence. Hence, by the arbitrariness of $\varepsilon$, we
see that $(f_n(\xi))_{n=0}^{\infty}$ is also a Cauchy sequence and we can define, $\forall\, \xi \in \Omega$,
$f(\xi) = \lim_{n \to \infty} f_n(\xi)$.
If $\xi, \eta \in \Omega$, $|\xi - \eta| < \delta_\varepsilon$, it then follows from Eq. (H2) that


$$|f(\xi) - f(\eta)| = \lim_{n \to \infty} |f_n(\xi) - f_n(\eta)| \le \varepsilon. \tag{H6}$$


It remains to show that the limit given by Eq. (H3) is uniform on $\Omega$. Otherwise,
we could find $\varepsilon > 0$, a sequence $n_i \xrightarrow[i \to \infty]{} \infty$ and points $x_i \in \Omega$ such that


$$|f_{n_i}(x_i) - f(x_i)| > \varepsilon, \qquad i = 1, 2, \dots \tag{H7}$$

We may assume (no loss of generality) that $n_i = i$, i.e.,


$$|f_n(x_n) - f(x_n)| > \varepsilon, \qquad n = 1, 2, \dots \tag{H8}$$


This is impossible because there would be an accumulation point $\bar x \in \Omega$ for the
sequence $x_n$, $n = 1, 2, \dots$ and again we may assume, without loss of generality,
that $\lim x_n = \bar x$. Then if $|\bar x - x_n| < \delta_{\varepsilon/4}$, using Eqs. (H6) and (H2),


$$\varepsilon < |f_n(x_n) - f(x_n)| \le |f_n(x_n) - f_n(\bar x)| + |f_n(\bar x) - f(\bar x)| + |f(\bar x) - f(x_n)| \le \frac{\varepsilon}{4} + |f_n(\bar x) - f(\bar x)| + \frac{\varepsilon}{4} \xrightarrow[n \to \infty]{} \frac{\varepsilon}{2} \tag{H9}$$


which is a contradiction. mbe


2 Corollary. Under the assumptions of Proposition 1, aside from that of
boundedness (or of closure or both) for $\Omega$, the same conclusions hold with the
exception of the uniformity of the convergence of $f_{n_i}(\xi)$ to $f(\xi)$. Nevertheless,
$f$ is uniformly continuous on $\Omega$.


PROOF. By inspection of the proof of Proposition 1. 


Exercises 


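1. Let $(f_n)_{n=0}^{\infty}$ be a sequence of $C^{(1)}(\Omega)$ functions on a convex set $\Omega$ which is the closure
of its interior. If there is $M$ such that $\sup_n \max_{\xi \in \Omega} |\frac{\partial f_n}{\partial \xi}(\xi)| < M$ then $(f_n)_{n=0}^{\infty}$ is an
equicontinuous family on $\Omega$. (Hint: Express the variation of $f_n$ as the integral of its derivative
along a segment joining two points.)


2. Define $C^{(\varepsilon)}(\Omega)$, $\varepsilon \in (0, 1]$, to be the set of the functions such that


$$|f|_\varepsilon \stackrel{\rm def}{=} \sup |f| + \sup_{x \neq y} \frac{|f(x) - f(y)|}{|x - y|^\varepsilon} < \infty.$$


Then any sequence $(f_n)_{n=0}^{\infty}$, $f_n \in C^{(\varepsilon)}(\Omega)$, such that $|f_n|_\varepsilon \le M < +\infty$, $\forall\, n$, is an
equicontinuous equibounded sequence.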


6.9 I: Fourier Series for Functions in $\overline{C}^{\infty}([0, L])$


Lemma 11, §4.5, p.266, will be proved here.
If $u \in \overline{C}^{\infty}([0, L])$, set

$$u^*(x) = u(x), \quad x \in [0, L], \qquad u^*(L + x) = -u(L - x), \quad x \in [0, L] \tag{I1}$$


and, by the assumption that the even derivatives of $u$ in $0$ and in $L$ vanish,
the function thus defined on $[0, 2L]$ is in $C^\infty([0, 2L])$ and is periodic, together
with all its derivatives, with period $2L$. By the Fourier theorem, we set

$$\hat u^*_h = \frac{1}{2L} \int_0^{2L} u^*(x)\, e^{-i \frac{\pi}{L} h x}\, dx \tag{I2}$$

$$\hat u^*_h = \frac{1}{2L} \int_0^{L} u(x)\, \big(e^{-i \frac{\pi}{L} h x} - e^{i \frac{\pi}{L} h x}\big)\, dx = -\frac{i}{L} \int_0^{L} u(x) \sin \frac{\pi h x}{L}\, dx \equiv \frac{\hat u_h}{2i} = -\hat u^*_{-h}, \tag{I3}$$

having used the change of variables $x \to L - x$. Therefore, for $x \in [0, 2L]$,

$$u^*(x) = \sum_{h=-\infty}^{+\infty} \hat u^*_h\, e^{i \frac{\pi}{L} h x} = \sum_{h=1}^{+\infty} \hat u_h \sin \frac{\pi h x}{L}. \tag{I4}$$


Hence, for $x \in [0, L]$,


$$u(x) = \sum_{h=1}^{+\infty} \hat u_h \sin \frac{\pi h x}{L} \tag{I5}$$


where $\hat u_h$ defined in Eq. (I3) coincides with Eq. (4.5.20). Equation (4.5.21)
follows from Eq. (I3) and from the decay properties as $h \to \infty$ of the Fourier
coefficients for $C^\infty$-periodic functions. Equation (I5) gives Eq. (4.5.22). mbe
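
A sine expansion of this kind can be tested numerically. The sketch below is only an illustration: it assumes the normalization $\hat u_h = \frac{2}{L} \int_0^L u(x) \sin\frac{\pi h x}{L}\, dx$ (which may differ by a constant from the convention of Eq. (4.5.20)), and it uses the hypothetical sample function $u(x) = \sin(\pi x / L)\, e^{\cos(\pi x / L)}$, whose odd $2L$-periodic extension is smooth, so that its even derivatives vanish at $0$ and $L$.

```python
import math

L = 2.0

def u(x):
    """Sample function whose odd 2L-periodic extension is smooth."""
    t = math.pi * x / L
    return math.sin(t) * math.exp(math.cos(t))

def sine_coefficient(h, n=2000):
    """(2/L) * integral over [0, L] of u(x) sin(pi h x / L), midpoint rule."""
    dx = L / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * dx
        total += u(x) * math.sin(math.pi * h * x / L)
    return 2.0 / L * total * dx

H = 20
coeffs = [sine_coefficient(h) for h in range(1, H + 1)]

def partial_sum(x):
    """Sum of the first H terms of the sine series."""
    return sum(c * math.sin(math.pi * (h + 1) * x / L) for h, c in enumerate(coeffs))

points = [L * j / 37 for j in range(38)]
max_error = max(abs(partial_sum(x) - u(x)) for x in points)
```

The coefficients of this sample function decay faster than any power of $h$, in line with the decay properties of Fourier coefficients of $C^\infty$-periodic functions, so twenty terms already reproduce $u$ to within rounding error.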


6.10 L: Proof of Eq. (5.6.20) 


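Let $(S_t^{(\delta)}(w_1, w_2))_i \equiv \sigma_i(t, w)$, $i = 1, 2$, $t \in [0, 1]$. Eqs. (5.6.17), (5.6.18) give


$$\begin{aligned} \sigma_1(t, w) &= w_1 + \int_0^t \chi_\delta(\sigma(\tau, w))\, \big(a\, \sigma_1(\tau, w) + P(\sigma(\tau, w))\big)\, d\tau, \\ \sigma_2(t, w) &= e^{-\nu_0 t}\, w_2 + \int_0^t e^{-\nu_0 (t - \tau)}\, \chi_\delta(\sigma(\tau, w))\, Q(\sigma(\tau, w))\, d\tau. \end{aligned} \tag{L1}$$

Consider, for instance, $\frac{\partial \sigma}{\partial w_1}$ and drop the $w$ in the arguments of $\sigma$, for
simplicity. Note that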
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ar = [ {axs(o(r))- OT) TA a Dis ey) 
Oo1 (T) 


+ x6(@(7)) (a 


Oo2 á —Vo(t-T ðo (7T) 
-j erol {Axs(o(r)) Dw, Q(a(7)) 


Ow, 


Oo (T 
+ xa(o(r)) 3Q(e(7)) - 22D ar 
W1 
where Og denotes Com 2L) if g is a function of w1, we and possibly other 


variables. Hence, using Eq. (5.6.15) and the fact that P and Q have a second- 
order zero at the origin, we see that there are two constants p,q such that 


[22 1) <p fj {5] 922 (als +52) + lall + aOd (23) 


Ow, Ow1 


(since |a(r)| < 6V2) and 


Og f {2 PEO oe «a SD) (14) 


Therefore, adding and subtracting 1 appropriately: 


c(t 
EE E + 2p(lal +8) 
Ow, 
t 
|Oo1 (rT) |Oo1(T) 
. eee | EA 
pie -e (L5) 
onih [ |Oo1(T) Oox(T) 
< ae 2 AN 
Foe | Sa 208 f {I FIG a 
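Setting $y(t) = \big|\frac{\partial \sigma_1(t)}{\partial w_1} - 1\big| + \big|\frac{\partial \sigma_2(t)}{\partial w_1}\big|$, the preceding inequalities, added up, imply

$$y(t) \le 2(p + q)(|a| + \delta)\, t + 2(p + q)(|a| + \delta) \int_0^t y(\tau)\, d\tau \tag{L6}$$


and $y(0) = 0$. The above integral inequality implies $y(t) \le Y(t)$, $\forall\, t \ge 0$, where


$$Y(t) = 2(p + q)(|a| + \delta)\, t + 2(p + q)(|a| + \delta) \int_0^t Y(\tau)\, d\tau \tag{L7}$$

and $Y(0) = 0$ (see Problems 8 and 9, §2.5). Hence,


$$Y(t) = \big(e^{2(p + q)(|a| + \delta)\, t} - 1\big) \le M\,(|a| + \delta) \tag{L8}$$


for $0 \le t \le 1$, $|a| < 1$, $\delta < 1$ (and $M$ could be $2(p + q)\, e^{4(p + q)}$).
An identical argument could be given for $t \in (-1, 0)$.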
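
The passage from the integral relation for $Y$ to its explicit form can be spelled out as follows. This is a short worked derivation, not taken from the book: the abbreviation $c = 2(p+q)(|a|+\delta)$ is introduced here for convenience, and the relation for $Y$ is assumed to hold with the equality sign.

```latex
% Assumed starting point: Y(t) = c t + c \int_0^t Y(\tau) d\tau, with Y(0) = 0
% and c = 2(p+q)(|a|+\delta) (abbreviation introduced for this sketch).
\begin{aligned}
Y(t) &= c\,t + c \int_0^t Y(\tau)\, d\tau, \qquad Y(0) = 0
  \;\Longrightarrow\; \dot Y(t) = c\,\big(1 + Y(t)\big)
  \;\Longrightarrow\; Y(t) = e^{c t} - 1, \\
e^{c t} - 1 &\le c\,t\, e^{c t} \le c\, e^{c} \le 2(p+q)\, e^{4(p+q)}\, (|a| + \delta)
  \qquad \text{for } 0 \le t \le 1,\ |a| < 1,\ \delta < 1,
\end{aligned}
```

The last step uses $c \le 4(p+q)$ under the stated restrictions on $a$ and $\delta$, which is how a value such as $M = 2(p+q)\,e^{4(p+q)}$ arises.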
6.11 M: Proof of Eq. (5.6.63)


Let 
(x, m(x)) = S{ (£0, m(a0)), (x, my (x)) = SE (wh, m(2g)) (M1) 


Then from Eq. (5.6.33), it follows that 


t 
|\7r4 (a) — Ty (x)| < |e~”' (a) = e™™ot a(x) )| iy p dr e7 volt-7) 
t 5 (M2) 
Zs(S1® (xo, T(£0)), a) -f dt 7 rolt =T) 75 (G8) (xp, (2g)), a)l. 
0 


Using Eqs. (5.6.25), (5.6.24), and (5.6.49) and supposing 0 < t < t < t4}, the 
right-hand side of Eq. (M2) is 


Sew — "| |(xo)| + e™* |(ao) — (a9) 


t’ 
+M- r+ | je- — erot -7| Mô? dr (M3) 
0 
t 
-J er) |Z5(S{® (x0, m(0)), a) — Zs (SP (ah, (2h), a)ldr 
0 
<dvp|t — t| + |r(xo) — 1(xp)| + (1 + vot) M6? |t — t'| 
+ 2Mtd(1 + M (a+ + 8)t)(|£o — zol + |7(z0) — a (2)I) 
<(6v9 + (1 + vot) M5?)|t — t'| + { (evo + 2Mt5(1 + M(a4 + 5)t) 
+ 2Mt6(1 + M (a+ + 6)t)CV8)} a0 — x6]. 


To estimate |£o — 76| proceed as in subsection 5.6.G, p.421, using the expres- 
sions analogous to Eq. (5.6.58): 


t 
ets | drX5(S°° (x, m(2)), a), 
0 


(M4) 
To =s- f drXs (SÀ (x, ny (2)),a), 
0 
By Eqs. (5.6.25), (5.6.23), and (5.6.20), 
t’ 
EE E AS +f fe 
0 
-|X5(S& (x, m(a)), a) — X5(S“ (x, my (a), a) (M5) 


< M(a46+62)|t — t'| +2M(ay, +t + M(az + Hml) — ty (2). 
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The restrictions imposed on a, to, by the second of Eqs. (5.6.41) imply (recall 
that C,d < 1) 


0 =(CV6 + 2tM6(1+ M (a4 + d)t)(1 + CV5)) 


-2M (a, +6)t(1+ M(ay + ôt) (M6) 
<+ a0 +s ))0+) G42) <5 


By combining the last of Eqs. (M5) with the last of Eqs. (M3), it follows that 


(1 — 0)|me(x) — mer (x)| < (dv + (1 + vot) M6")|t — t'|, (M7) 


so that, since 0 < 4: 
|re(x) — me (x)| < 2(dv9 + (1 + t+) M6?)|t — t'| (M8) 


Vt, t € [0, t+]; hence, by Eq. (5.6.51), for all t,t E R4, Y <t, |t- t| < t+. 


6.12 N: Analytic Implicit Functions 


The proofs of Propositions 20 and 21, §5.11, are based on the following idea.
Let $F$ be a holomorphic function of a single complex variable $z \in \Omega \subset \mathbf{C}$.
Assume that its complex derivative, denoted by a prime in this section, $F'(z)$,
does not vanish in $\Omega$.

It is a consequence of the theory of power series that, as $z'$ varies in a
small vicinity of $z'_0 = F(z_0)$ and $z$ varies close to $z_0$, the equation $z' = F(z)$
can be uniquely solved for $z$ by a function $I$ defined in a neighborhood $U$ of
$z'_0$ and holomorphic in $U$:


$$F(I(z')) = z' \tag{N1}$$


for all $z'$ in $U$, and


$$I(F(z)) = z \tag{N2}$$


for all $z$ in a suitable neighborhood of $z_0$. The function $I$ has Taylor coefficients
in $z'_0$ which can be computed via a simple algorithm from those of $F$ in $z_0$.

The function $F$ will be invertible on the whole $F(\Omega)$ if and only if $F(z) \neq
F(z')$ whenever $z \neq z'$. In this case the inverse function $I$ will be holomorphic
on $F(\Omega)$ and it will be the unique inverse of $F$ defined on $F(\Omega)$.

A simple criterion implying that $F(z) \neq F(z')$ for $z \neq z'$ is the following.
Suppose that for every pair $z, z' \in \Omega$ there is a smooth curve $\Lambda(z, z') \subset \Omega$
with length $|\Lambda(z, z')|$ bounded by


$$|\Lambda(z, z')| \le \beta(\Omega)\, |z - z'|, \tag{N3}$$

where $\beta(\Omega)$ is a suitable constant. Then $F$ will be a one to one map between
$\Omega$ and $F(\Omega)$ if


$$\sigma \equiv \beta(\Omega) \sup_{z \in \Omega} |F'(z) - 1| < 1. \tag{N4}$$

In fact (N4) implies

$$|F(z) - F(z')| = \Big| \int_{\Lambda(z, z')} F'(\zeta)\, d\zeta \Big| = \Big| \int_{\Lambda(z, z')} d\zeta + \int_{\Lambda(z, z')} (F'(\zeta) - 1)\, d\zeta \Big| \ge |z - z'| - \sigma\, |z - z'| = (1 - \sigma)\, |z - z'|. \tag{N5}$$


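Proposition 20 can be proved by using the above remarks. First consider the
inversion problem for the equation


$$\varphi' = \varphi + g(\varphi) \mod 2\pi \tag{N6}$$


with $\varphi \in \mathbf{T}^1$ and $g$ holomorphic on $C(\xi)$. Let $\hat g$ be the holomorphic extension
of $g$ to $C(\xi)$. Setting $z = e^{i\varphi}$, $z' = e^{i\varphi'}$, Eq. (N6) can be written


$$z' = z\, e^{i \hat g(z)} \equiv F(z), \qquad z \in \mathbf{T}^1. \tag{N7}$$


Let $\delta \in (0, 1)$, $\delta < \frac{1}{2}\xi$ (say); we regard (N7) as an equation for $z \in C(\xi - \delta)$,
i.e., $\Omega = C(\xi - \delta)$ in the language of the above discussion.

Between any two points $z, z' \in C(\xi - \delta)$ draw a line $\Lambda(z, z')$ contained in $C(\xi - \delta)$
with length $\le 2\pi |z - z'|$: i.e. $\beta(C(\xi - \delta))$, see Eq. (N3), can be taken $= 2\pi$.
Hence Eq. (N7) can be inverted in $C(\xi - \delta)$ under the condition


$$2\pi \sup_{z \in C(\xi - \delta)} \big| e^{i \hat g(z)} - 1 + i\, \hat g'(z)\, z\, e^{i \hat g(z)} \big| < 1 \tag{N8}$$


which also ensures that $F'(z) \neq 0$ because


$$F'(z) = 1 + \big(e^{i \hat g(z)} - 1\big) + i\, \hat g'(z)\, z\, e^{i \hat g(z)}. \tag{N9}$$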
The supremum in inequality (N8) is bounded dimensionally, as in (5.11.18): 
Qn (elle —1)+ elle e& 5-1) < 2re*elsle|g|e5-! (N10) 
By the above analysis a function I(z’) on F(C(€ — 6)) can be defined with 
RG) 25 Yz’ € F(C(E —59)), provided (N11) 
4re?elIle gjg! <1 (N12) 
The form of $F$,


$$F(z) = z\, e^{i \hat g(z)}, \tag{N13}$$


implies that


$$F(C(\xi - \delta)) \supset C(\xi - \delta - |g|_\xi) \tag{N14}$$


because the $F$-image of $\partial C(\xi - \delta)$ consists of two lines outside $C(\xi - \delta -
|g|_\xi)$ and the boundary of $F(C(\xi - \delta))$ is $F(\partial C(\xi - \delta))$. The latter property
follows from general properties of holomorphic functions but it can also be
seen directly in our case as follows. If $z'_0 \in \partial F(C(\xi - \delta))$, there is a sequence
$z_n \in C(\xi - \delta)$ such that


$$z'_0 = \lim_{n \to \infty} F(z_n) \tag{N15}$$


and, without loss of generality, we may suppose that the sequence $z_n$ converges
to a limit $z_0$. If $z_0 \in \partial C(\xi - \delta)$, then $z'_0 \in F(\partial C(\xi - \delta))$; if $z_0 \in C(\xi - \delta)$, then
the local invertibility of $F$ implies that $z'_0$ is interior to $F(C(\xi - \delta))$, which is
impossible.

Therefore if Eq. (N12) holds the function $I$ inverse to $F$ is holomorphic at
least in $C(\xi - 2\delta)$, because Eq. (N12) implies $|g|_\xi < \delta$. Assuming the validity
of the inequality in Eq. (N12), set

$$\hat\Delta(z') = -\hat g(I(z')), \qquad z' \in C(\xi - 2\delta). \tag{N16}$$


This defines a holomorphic function on $C(\xi - 2\delta)$ such that

$$|\hat\Delta|_{\xi - 2\delta} \le |g|_\xi, \quad \text{and} \tag{N17}$$


$$I(z') = z'\, e^{i \hat\Delta(z')}. \tag{N18}$$

As $z$ varies on the unit circle, the point $z' = F(z)$ also varies on the unit circle
so that $\hat\Delta$ is real on the $F$-image of the unit circle: since $|g|_\xi < \delta$ and $F$ is
given by Eq. (N13) it follows (by a continuity argument) that as $z$ varies on
the unit circle $z'$ varies covering the entire unit circle. This means that $\hat\Delta$ is
real on the unit circle and it becomes possible to define

$$\Delta(\varphi') = \hat\Delta(e^{i \varphi'}) \tag{N19}$$


and $\Delta$ is analytic and real on $\mathbf{T}^1$.
Since $\delta$ is arbitrary in $(0, \frac{1}{2}\xi)$, replacing $2\delta$ by $\delta$ the theorem is proved, in
the case considered, under the condition


8refelelgled-) <1 (N20) 


Next we study the inversion problem for the equation 


gy =9+9(A,¢9), (N21) 


where g is holomorphic on C (o, €; Ao). We write Eq. (N21) as 


gaz eTA) (N22) 
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Repeat, VA € S,(Ao); the above argument, once more keeping in all the for- 
mulae an explicit A dependence which will, however, play no role whatsoever. 
So Eq. (N22) will be invertible in the form 


z = z’ BA) (N23) 


with A holomorphic on C(ọ,£-— 28; Ao) if Eq. (N20) holds with |g|¢ replaced 
by |glo,¢. The function A will also turn out to be real for A € S,(Ao), i.e. for 
A real and for |z’| = 1. 

The same conclusions hold if g is defined and holomorphic on a more 
general set of the form W x T+ with W C C° open. Eq. (N22) is inverted 
by Eq. (N23) if Eq. (N20) holds with |g|¢ replaced by the supremum of g in 
W x C(x). 

With these remarks in mind, the proof of Proposition 20 can be concluded. 
Consider the case contemplated in Proposition 20: 


p' =p+gA, g) (N24) 


with g extending to a holomorphic function on C (o, £; Ao). Write the system 
of Eq. (N24) as 


zh = zese (A2), k=1,...,p, (N25) 
and consider the first equation for 21: 
zi = 2689 (Azzo), (N26) 


If Eq. (N20) holds with |g|o,¢ replaced by |g|o,¢, we can invert Eq. (N26) as 


ay = zetia (Azio) (N27) 


with A, holomorphic for A € S, (Ao), z, € C(€—0), and zp € C(&) for all 
k = 2,...,p. Also, |gloe < ô. Furthermore, Eq. (N27) inverts Eq. (N26) on 
the same set A € S,(Ao) x C(€ — 6) x C(€)*1, and 


|Ai| < |giloe < Igloe (N28) 


where lâl denotes the supremum of A; on its domain of definition. Finally, 
Aj, is real if A € S,(Ao), |z] = |z2] =... = |zk = 1. 
Now substitute Eq. (N27) into the Eq. (N25) for k = 2,...,p, and set 


gh (A, B63 83 Sp) = Gel A; zh Ar (Ava1 22-020) 23 8.54 Zp) (N29) 


which are defined and holomorphic for A € S,(Ao); zi € C(é — ô), and 


zk E C(€) and, of course, the supremum of | g| on its domain of definition 
can be estimated as 
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1 1 
sup gy’! < Ign leg < Ilo (N30) 
Hence, we can take as parameters A, 21, 22..., Zp) and solve the equation 
z} = 2 eis D(A, EE TS) (N31) 


for z2 as before, etc. After p steps, we will have inverted the full system, in 
the desired form, on the set C (o, € — 6; Ao) under the sole condition 
8re% el8leé g] el! < 1, (N32) 


Which, if ô < 1, and, hence, |g|o¢| < y can be put into the form of Eq. 
(5.11.19) with y < 28. With some care, one could find smaller values for y. 
In the same way, one can prove the implicit function theorem mentioned 
in Proposition 21. Since this is a “local theorem” , the proof is actually slightly 
easier than the above. 


6.13 O: Finite-Difference Method 


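Consider $\mathbf{f} \in C^\infty(\mathbb{R}^d)$ and the equation

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}), \qquad \mathbf{x}(0) = \mathbf{x}_0. \tag{O1}$$

To estimate $\mathbf{x}(\tau)$, given $\tau > 0$, let $\eta = \tau/N$, $N \in \mathbb{Z}_+$, and define inductively

$$\mathbf{x}_0 = \mathbf{x}(0), \qquad
\mathbf{x}_n = \mathbf{x}_{n-1} + \eta\, \mathbf{f}(\mathbf{x}_{n-1}), \quad n = 1, 2, \ldots, N. \tag{O2}$$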
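Let

$$C = \sup_{\mathbf{x} \in \Omega} |\mathbf{f}(\mathbf{x})|, \qquad
L = \sup_{\mathbf{x} \in \Omega} \sum_{i,j=1}^{d} \left| \frac{\partial f_i(\mathbf{x})}{\partial x_j} \right|, \tag{O3}$$

where $\Omega \subset \mathbb{R}^d$ is some convex region where one can a priori guarantee that
$\mathbf{x}(t)$, $\forall t \in [0, \tau]$, and $\mathbf{x}_n$, $\forall n = 0, 1, \ldots, N$, will fall ($\Omega$ has to be found in each
case: out of despair one could always take $\Omega = \mathbb{R}^d$). Then

$$|\mathbf{x}_N - \mathbf{x}(\tau)| \le \frac{C\tau}{2N} \left( e^{L\tau} - 1 \right). \tag{O4}$$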


This formula gives an a priori estimate of the error that would be committed if 
one iteratively solved Eq. (O1) with the method of Eq. (O2) (“finite-difference 
algorithm” ). It can be used in many of the exercises proposed in this book, 
where the use of a computer is suggested. 
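
As an illustration, the finite-difference algorithm and its a priori error estimate can be sketched in a few lines of Python. The function names `euler` and `a_priori_bound` and the toy equation $\dot x = -x$ are assumptions made here for the example, not part of the text; the bound is taken in the form $\frac{C\tau}{2N}(e^{L\tau} - 1)$, which is what the iteration in the proof below gives.

```python
import math

def euler(f, x0, tau, n_steps):
    """Iterate x_n = x_{n-1} + eta * f(x_{n-1}) with eta = tau / n_steps, as in Eq. (O2)."""
    eta = tau / n_steps
    x = x0
    for _ in range(n_steps):
        x = x + eta * f(x)
    return x

def a_priori_bound(C, L, tau, n_steps):
    """Bound C*tau/(2N) * (exp(L*tau) - 1) obtained by iterating the one-step estimate."""
    return C * tau / (2.0 * n_steps) * (math.exp(L * tau) - 1.0)

# Toy problem (an assumption for illustration): dx/dt = -x, x(0) = 1.
# Both the exact solution exp(-t) and the iterates stay in Omega = [0, 1],
# where |f| <= C = 1 and |df/dx| <= L = 1.
tau = 1.0
for n_steps in (10, 100, 1000):
    error = abs(euler(lambda x: -x, 1.0, tau, n_steps) - math.exp(-tau))
    bound = a_priori_bound(1.0, 1.0, tau, n_steps)
    print(n_steps, error, bound)
```

In this example the observed error stays below the bound and decreases roughly in proportion to $1/N$, which is the behaviour the estimate predicts.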

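The proof of Eq. (O4) is a simple consequence of the considerations and
proofs given in §2.2-§2.4.

PROOF. Let $\mathbf{d}_k \stackrel{\rm def}{=} \mathbf{x}_k - \mathbf{x}(k\eta)$, $k = 0, 1, \ldots, N$. One finds

$$\begin{aligned}
\mathbf{d}_k &= \mathbf{x}_{k-1} + \eta\, \mathbf{f}(\mathbf{x}_{k-1}) - \mathbf{x}((k-1)\eta)
 - \int_0^\eta \mathbf{f}(\mathbf{x}((k-1)\eta + \theta))\, d\theta \\
&= \mathbf{d}_{k-1} - \int_0^\eta \big( \mathbf{f}(\mathbf{x}((k-1)\eta + \theta)) - \mathbf{f}(\mathbf{x}_{k-1}) \big)\, d\theta.
\end{aligned} \tag{O5}$$

Hence, applying Taylor's formula and adding and subtracting suitable terms:

$$\begin{aligned}
|\mathbf{d}_k| &\le |\mathbf{d}_{k-1}| + L \int_0^\eta |\mathbf{x}((k-1)\eta + \theta) - \mathbf{x}_{k-1}|\, d\theta \\
&\le |\mathbf{d}_{k-1}| + L \int_0^\eta \big( |\mathbf{x}((k-1)\eta + \theta) - \mathbf{x}((k-1)\eta)| + |\mathbf{d}_{k-1}| \big)\, d\theta \\
&\le |\mathbf{d}_{k-1}| + L\eta\, |\mathbf{d}_{k-1}| + LC \int_0^\eta \theta\, d\theta,
\end{aligned} \tag{O6}$$

where in the last inequality, the derivative of $\mathbf{x}$ has been bounded by recalling
that $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$ and $|\mathbf{f}(\mathbf{x})| \le C$. If $\Omega \ne \mathbb{R}^d$ Taylor's formula can still be applied
by the convexity assumption on $\Omega$ (by the proofs of appendix A). Then

$$|\mathbf{d}_k| \le (1 + L\eta)\, |\mathbf{d}_{k-1}| + \frac{LC}{2}\, \eta^2, \tag{O7}$$

which, by iteration, yields (since $\mathbf{d}_0 = 0$)

$$|\mathbf{d}_k| \le \frac{LC}{2}\, \eta^2 \sum_{j=0}^{k-1} (1 + L\eta)^j = \frac{C\eta}{2} \big( (1 + L\eta)^k - 1 \big), \tag{O8}$$

which for $k = N$, recalling that $\eta = \tau/N$, becomes

$$|\mathbf{x}_N - \mathbf{x}(\tau)| \le \frac{C\tau}{2N} \Big( \big(1 + \tfrac{L\tau}{N}\big)^N - 1 \Big) \le \frac{C\tau}{2N} \left( e^{L\tau} - 1 \right). \tag{O9}$$

The approximation is therefore of order $O(N^{-1})$ at fixed $\tau$. Since the
relation $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$, by differentiating $n - 1$ times with respect to $t$, yields
expressions for the first $n$ derivatives, it is possible to obtain "higher order
approximations", $O(N^{-n})$, by natural modifications of the above algorithm.
It is also possible to achieve higher order approximations avoiding the
(often lengthy) calculations of the higher order derivatives and using only
$\mathbf{f}(\mathbf{x})$ (of course evaluated at several points): the most common algorithm is
the Runge-Kutta algorithm. Its fourth order version is used in producing the
graphs of §4.8 in the programs attached to this book.
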
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6.14 P: Astronomical Data 


(1) Gravitational constant $k = 6.67 \times 10^{-8}\ {\rm cm^3/(g\, sec^2)}$.

(2) Radius of the Sun: $R_s = 6.96 \times 10^{5}$ km.

Mass of the Sun: $M_s = 1.99 \times 10^{33}$ g.

Density of the Sun: $\rho_s = 1.41\ {\rm g/cm^3}$.

(3) Elements of the Planets' Orbits.


Planet  | Semiaxis | Semiaxis | Sidereal period | Eccentricity | Ecliptic incl. | Long. asc. node | Long. perihelion
        | a.u.     | 10^6 km  | days            |              |                |                 |
Mercury | 0.387099 |          | 87.969          | 0.206625     | 7°0′13.8″      | 47°44′66″       | 76°40′32″
Venus   | 0.723332 |          | 224.700         | 0.006793     | 3°23′39.3″     | 76°14′11″       | 130°51′20″
Earth   | 1.000000 |          | 365.257         | 0.016729     |                |                 | 102°04′41″
Mars    | 1.52369  |          | 686.980         | 0.093357     | 1°51′0.0″      | 49°10′25″       | 335°58′19″
Jupiter | 5.2028   |          | 4332.587        | 0.048417     | 1°18′21.2″     | 99°56′55″       | 13°31′33″
Saturn  | 9.540    |          | 10759.21        | 0.055720     | 2°29′26.1″     | 113°13′37″      | 92°04′39″
Uranus  | 19.18    |          | 30685.          | 0.0471       | 0°46′22.0″     | 73°43′36″       | 169°51′
Neptune | 30.07    |          | 60188.          | 0.0087       | 1°46′28.1″     | 131°13′51″      | 44°10′
Pluto   | 39.44    |          | 90700.          | 0.247        | 17°08′24″      | 109°38′02″      | 223°30′


For the year 1950. From [5].


(4) Elements of the Planets.


Planet  | Rotation period
Mercury | 58d.65
Venus   | 243d.2 **
Earth   | 23h56′4″.1
Mars    | 24h37′22″.6
Jupiter | 9h50′.5
Saturn  | 10h14′
Uranus  | 10h49′ **
Neptune | 15h.8
Pluto   | 6d.4


* Approximate ** Retrograde
From [5].
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5) Satellites of the Planets. 


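Planet  | Satellite    | Sidereal period (days)
Earth   | Moon         | 27.321661
Mars    | 1. Phobos    | 0.318910
Mars    | 2. Deimos    | 1.262441
Jupiter | 1. Io        | 1.769138
Jupiter | 2. Europa    | 3.551181
Jupiter | 3. Ganymede  | 7.154553
Jupiter | 4. Callisto  | 16.689018
Jupiter | 5. Amalthea  | 0.498179
Jupiter | 6.           | 250.62
Jupiter | 7.           | 259.8
Jupiter | 8.           | 738.9
Jupiter | 9.           | 755.
Jupiter | 10.          | 260.
Jupiter | 11.          | 696.
Jupiter | 12.          | 625.
Saturn  | 1. Mimas     | 0.942422
Saturn  | 2. Enceladus | 1.370218
Saturn  | 3. Tethys    | 1.887802
Saturn  | 4. Dione     | 2.736916
Saturn  | 5. Rhea      | 4.517503
Saturn  | 6. Titan     | 15.945452
Saturn  | 7. Hyperion  | 21.276665
Saturn  | 8. Iapetus   | 79.33082
Saturn  | 9. Phoebe    | 550.45
Saturn  | 10. Themis   | 0.749
Uranus  | 1. Ariel     | 2.52038
Uranus  | 2. Umbriel   | 4.14418
Uranus  | 3. Titania   | 8.70588
Uranus  | 4. Oberon    | 13.46326
Uranus  | 5. Miranda   | 1.414
Neptune | 1. Triton    | 5.87683
Neptune | 2. Nereid    | 500.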


P. on the plane of the planet’s equator 
B. on the plane of the planet’s orbit 
R. retrograde rotation 


From [5]. 


PeriodSyn. 
days 
29d12h44'02'".8 
073926.65 
1062115.68 
1182835.95 
3131753.74 
7035935.86 
16180506.92 
115727.6 
260.0 
276.10 
631.05 
626 
276 
599 
546 
223712.4 
1085321.9 
1211854.8 
2174209.7 
4122756.2 
15231525 
21073906 
79220456 
53616 


SIE 
1.8P 
1.4P 

OP 

OP 

OP 


2122940 
4032825 
81700 
13111536 


5210327 


Inclin. 


547 


Radius 
km 


1738 


0.140 
0.207 
0.13 
0.0196 
0.0045 
0.0000 
0.0021 
0.0009 
0.0289 
0.110 
0.029 
0.166 
300 
0.007 
0.008 
0.023 
0.010 


0.000 
0.7 


15000000 
8000000 
870000 
555000 
250000 
4150 
5000000 
100000 


700 
3000000 
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6.15 Q: Gauss Method for Planetary Orbits 
of an Orbit through Three Observations 


This appendix contains a series of guided problems on the two body central
motion, taken from Gauss' treatise on the motion of heavenly
bodies gravitating about the Sun in conic sections (1809).

1. (Earth motions) the Earth is assumed spherical and its rotation axis has
a conical precession motion around the axis M (celestial north), perpendicular
to the Earth orbital plane ε (the ecliptic). The two rotations take place at angular
velocities $\omega_D$ and $\omega_p$, respectively, and are called the diurnal rotation
and the precessional rotation. The second is very slow for the following
qualitative reasons, which could be made quantitative at least as far as the
orders of magnitude are concerned, and even as far as the actual theoretical
computation of the first order corrections.

Show that if the Earth were really a perfect sphere then one would expect
that the Earth axis would stay fixed in orientation (Hint: in a frame of reference
with center at the Earth center and axes fixed with the fixed stars the
moment of the forces exerted by the Sun and by the Moon would vanish by
symmetry and so would the moment of the inertial forces. Hence the motion
would be that of a sphere with fixed center and no external forces: i.e. the
axis would be fixed and no precession would be present).

Show also that if the Earth had cylindrical symmetry around its axis and
one still neglected the forces exerted by the Sun and the Moon, then one
would expect it to have a uniform rotation around its axis which in turn
would rotate at constant angular velocity around a fixed axis (oriented as the
angular momentum), keeping a constant angle with it.


2. (further considerations on the Earth spin motion) the following heuristic 
considerations are useful to keep in mind, even though strictly speaking, they 
are not specifically part of the problem of the orbit determination but rather 
pertain to the general problem of fixing the reference frames. 

Since the Earth angular velocity and angular momenta can be taken as 
essentially parallel this movement would simply cause the inclination of the 
Earth axis as well as the intersection between the ecliptic and the plane or- 
thogonal to the Earth axis to have a small motion around their average values: 
it could not be responsible for the precession motion. At best it could account 
for a small motion of the rotation axis around its average position. The preces- 
sion is therefore caused by the action of the forces due to the Sun and Moon 
and to the non spherical symmetry of the Earth, and to the non circularity of 
the Earth and Moon orbits (causing further variations of the forces exercised 
by the Sun and Moon). 

If the forces due to the Sun and to the Moon, and the inertial forces, were
constant in time in the frame of reference with center at the Earth and x-axis
pointing at the Sun (which they are not, because the distance and relative
positions of the bodies change periodically in time, to a first approximation)
then the Earth motion would be that of a top subject to a constant torque
trying to put the Earth equatorial plane on the ecliptic plane, where
the Sun and Moon can be thought to be (again to a first approximation).
Hence, as it follows from the theory of the spinning top, the Earth axis rotates
around the M axis (because the variations of the angular velocity have to
rotate around the axis such that, if the body were oriented parallel to it, then
the moment of the forces would vanish, which in our case is M since the Earth
is compressed at the poles).

However in the theory of the top it emerges that the speed of rotation is
not uniform: it follows in fact that it periodically changes with time. Hence
the motion is only to a first approximation uniform and the actual motion
consists of the above rotation-precession plus a nutation motion which causes
the precession speed to be altered and the inclination to oscillate quasi
periodically around a mean value. All the above corrections can be given explicit
theoretical values by using the theory of perturbation of integrable motions,
under the assumption that the motions of the Sun and of the Moon are essentially
known and given by Kepler's laws: this is called the principal correction.

Again this may not be satisfactory and one could introduce further refinements.
We do not enter here into the details of such calculations and we
summarize the above discussion by saying that one can compile, on theoretical
grounds, tables which allow one to determine as a function of time the positions of
the Earth axis and that of the intersection of the Earth equatorial plane and
the ecliptic plane. The data given below are deduced from such tables, called
the Astronomical Ephemeris tables.


3. (zenithal frame) if O is an astronomical observatory the local system of
coordinates will have the origin in O and z-axis pointing upwards vertically
(along a plumb line), i.e. towards the zenith Z. The x-axis will be the horizon
axis Ω, determined by the tangent to the Earth in the plane μ of the z-axis
and the terrestrial axis or north axis; the orientation of the Ω-axis will be
towards south. The plane μ containing the zenith axis and the north axis will
be called the meridian plane. Draw a graphical representation of the above
frame.


4. (equatorial frame) in this frame one takes the origin to be the Earth center
T and the z-axis to be the axis N of the Earth's rotation, oriented towards north.
The plane ZN cuts the plane orthogonal to the N axis (called the equatorial
plane) along a line called the equator line, which is taken to be the x-axis of
the equatorial frame.

Thus the equatorial frame and the zenithal frame are fixed relative to
each other. Check that the angle $\delta_0$ between the equator and the zenith is
what is commonly called the latitude of the observatory and draw a graphical
representation of the zenithal and equatorial frames.


5. (geocentric frame) in this frame the origin is the Earth center T but the
xy plane is the ecliptic plane (see 1)). The z axis points to the celestial north
M and the x-axis will be parallel to the intersection between the equatorial
plane and the ecliptic plane ε. The orientation of the axis is towards the Aries
constellation, the Γ point (in fact this axis goes roughly through the Spring
and Autumn noon positions of the Sun, i.e. through Aries and ???).

When the Sun crosses this axis one has the equinox, respectively the Spring
or Autumn equinoxes. The x axis is called the equinox axis and is denoted
by Γ. The angle between the M and N axes is the inclination angle $i_0$
of the Earth axis. Therefore the Γ axis is not fixed in direction but rotates
around the M axis with angular velocity $\omega_p$.

Find a graphical representation for the equatorial and the geocentric
frames.


6. (heliocentric frame) this is an inertial frame: its origin is at the center of
mass of the solar system (which we identify here with the center of the Sun
for simplicity). The z axis is orthogonal to the ecliptic and parallel to the M
axis previously introduced. Thus the xy plane is the ecliptic plane ε. The x
axis will be parallel to the equinox axis Γ.

Knowing that the Earth motion on the ecliptic is a counterclockwise rotation,
as seen standing up on the northern hemisphere, check that when the
Earth crosses the positive Γ-axis it is the Autumn equinox and that the N
axis is obtained from the M axis by a clockwise rotation of an angle equal to
the Earth inclination angle $i_0$ (Hint: because the Sun is in Aries in Spring
and because Winter comes after Autumn).

Find a graphical representation of the heliocentric and of the geocentric
frames and mark the point where the Earth would be at the Autumn or Spring
equinox and the Γ point.


7. (precession and nutation) since the Γ point moves because of the precession
one fixes the x axis of the heliocentric and geocentric systems to be the above
axes in the positions in which they were at a given time, called the epoch E of
the time measurements, which is taken as the origin of time. At any other
time t the Γ axis will form an angle $\omega_p (t - E)$ with the x axis
of the heliocentric system. To distinguish between the two lines one calls the
actual intersection between the equator and the ecliptic the apparent equinox
line, denoted $\Gamma_{app}$.

The Γ-point and the inclination $i_0$ in fact change in time also because of the
nutation, and we assume for the purposes of this illustration of Gauss' method
that this change can be deduced from the astronomical ephemeris tables to
be equivalent to replacing $\lambda_p$ by $\lambda_p + \lambda_n$ and $i_0$ by $i$.


8. (observations) by observation of a celestial body one means the recording
of the time $t$ at which it crosses the meridian plane and of the angle $\delta$ above
the horizon at which it is seen at the moment of the crossing. Sometimes one
records, instead of $\delta$, the angle $\delta_E$ above the equator at which it is seen. Of course
the relation between the two data is simply: $\delta_E = \delta + \delta_0 - \pi/2$. The angles $\delta$ and $\delta_E$ are
the heights above the horizon or above the equator. Show that one expects to
have to make three observations to determine the orbit of the planet (Hint:
the system has three degrees of freedom, i.e. six parameters).


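9. (apparent positions) call $\mathbf{R}_T$ the vector leading from S to T, $\mathbf{R}_O$ the vector
from T to O and $\mathbf{B}_a$ the unit vector pointing in the apparent position of the
body. In the zenithal frame it is $\mathbf{B}_a = (\cos\delta, 0, \sin\delta)$. Introduce the matrices:

$$V_1(\alpha) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{pmatrix}, \quad
V_2(\alpha) = \begin{pmatrix} \cos\alpha & 0 & \sin\alpha \\ 0 & 1 & 0 \\ -\sin\alpha & 0 & \cos\alpha \end{pmatrix}, \quad
V_3(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{Q1}$$

and the vectors

$$\mathbf{n}_1 = (1, 0, 0), \quad \mathbf{n}_2 = (0, 1, 0), \quad \mathbf{n}_3 = (0, 0, 1);$$

check that the matrices $V_1, V_2, V_3$ rotate the whole world counterclockwise by
an angle $\alpha$ around the axes 1, 2, 3.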
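
The check can also be done numerically. The following is a minimal sketch in plain Python; the helper names `apply` and `close` are introduced here only for the example. It writes the three rotation matrices as nested lists and verifies that a quarter turn around axis 3 sends $\mathbf{n}_1$ to $\mathbf{n}_2$, around axis 1 sends $\mathbf{n}_2$ to $\mathbf{n}_3$, and around axis 2 sends $\mathbf{n}_3$ to $\mathbf{n}_1$, i.e. that each rotation is counterclockwise.

```python
import math

def V1(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def V2(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def V3(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(M, v):
    """Matrix-vector product for a 3x3 nested list and a 3-tuple."""
    return tuple(sum(M[i][j] * v[j] for j in range(3)) for i in range(3))

def close(u, v, tol=1e-12):
    """Componentwise comparison up to rounding errors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

n1, n2, n3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
quarter = math.pi / 2

print(close(apply(V3(quarter), n1), n2))  # rotation around axis 3: n1 -> n2
print(close(apply(V1(quarter), n2), n3))  # rotation around axis 1: n2 -> n3
print(close(apply(V2(quarter), n3), n1))  # rotation around axis 2: n3 -> n1
```

The same helpers can be composed to evaluate products such as $V_3(\cdot) V_1(\cdot) V_3(\cdot) V_2(\cdot)\, \mathbf{n}_1$ by applying the matrices from right to left.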

We call $\lambda_T$ the angle between T and the $\Gamma_{app}$-line; the angle of inclination
of the Earth axis will be $i_0$. Because of the precession and nutation mentioned
above, the angle $i$ between the N axis and the M axis is somewhat different
from $i_0$. Also the longitude angle between T and the fixed Γ line is $\lambda_T + \lambda_p + \lambda_n$
(see 7) above). Let $R$ be the Earth radius.

Show that:

$$\begin{aligned}
\mathbf{R}_T &= D_T\, V_3(\lambda_T + \lambda_p + \lambda_n)\, \mathbf{n}_1 \\
\mathbf{R}_O &= R\, V_3(\lambda_p + \lambda_n)\, V_1(-i)\, V_3(\lambda_0)\, V_2(\tfrac{\pi}{2} - \delta_0)\, \mathbf{n}_3
\end{aligned} \tag{Q2}$$

Setting $\mathbf{A} = \mathbf{R}_T + \mathbf{R}_O$, $\mathbf{X}_a = \mathbf{A} + \rho \mathbf{B}_a$, where $\rho$ is the distance between
the heavenly body C and the observatory O, we see that the vector $\mathbf{A}$ consists
of two terms of different order of magnitude (because $R/D_T \ll 1$): the first
is the heliocentric place of the Earth and the second is the parallax correction.
The vector $\mathbf{B}_a$ is the apparent heliocentric place of the heavenly body and $\rho$
is of course unknown.


10. (fixed stars aberration) this is a further correction that bears this name
because it has to be considered even when one observes a fixed star. It is due
to the finiteness of the light speed $c$. By the classical composition law of
velocities we see that if $c\mathbf{B}$ is the light velocity in the heliocentric frame and $c'\mathbf{B}_a$
is the velocity in the zenithal frame and if $\mathbf{v}$ is the velocity of the observatory,
then:

$$c\mathbf{B} = c'\mathbf{B}_a + \mathbf{v} \tag{Q3}$$

Show that if one neglects corrections of order $(v/c)^2$ then one can write:

$$\mathbf{B} = \mathbf{B}_a \left| \mathbf{B} - \frac{\mathbf{v}}{c} \right| + \frac{\mathbf{v}}{c}
 = \mathbf{B}_a \left( 1 - \frac{\mathbf{v} \cdot \mathbf{B}_a}{c} \right) + \frac{\mathbf{v}}{c} \tag{Q4}$$

There is no need to use the relativistic velocity composition law as it leads to
corrections of order $O((v/c)^2)$ which have anyway been neglected in deducing
the last equation.


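11. (computation of fixed stars aberrations) if $\omega_T$ is the angular velocity of the
diurnal Earth rotation show that the velocity of the observatory can be written:

$$\mathbf{v}_O = \omega_T R \cos\delta_0\, V_3(\lambda_p + \lambda_n)\, V_1(-i)\, V_3(\lambda_0)\, \mathbf{n}_2 \tag{Q5}$$

Show also that $\mathbf{v}_T$ can be computed from the fact that the Earth motion is
Keplerian in terms of the vector $\mathbf{R}_T$, which in the heliocentric frame has polar
coordinates $D_T$, $\theta = \lambda_T + \lambda_n + \lambda_p$, by:

$$\mathbf{v}_T = \dot D_T\, \frac{\mathbf{R}_T}{D_T} + D_T \dot\theta\, \frac{\mathbf{n}_3 \wedge \mathbf{R}_T}{D_T} \tag{Q6}$$

Using Eqs. (4.10.11),(4.10.12),(4.10.18), i.e. the fact that the areas constant
is $A = 2\pi R_g R_m / T$ where $T$ is the Earth revolution period and $R_g, R_m$ are
the major and minor semiaxes of the Earth orbit, show that:

$$\begin{aligned}
\dot D_T &= \pm \frac{2\pi R_g R_m}{T} \left( \Big( \frac{1}{R_-} - \frac{1}{D_T} \Big) \Big( \frac{1}{D_T} - \frac{1}{R_+} \Big) \right)^{1/2} \\
D_T \dot\theta &= \frac{2\pi R_g R_m}{T}\, \frac{1}{D_T}
\end{aligned} \tag{Q7}$$

where $R_+, R_-$ denote the aphelion and perihelion distances in the Earth orbit,
i.e. $R_\pm = R_g (1 \pm e)$ if $e$ is the Earth orbit eccentricity.
Setting $D = D_T / R_g$ we find after some algebra:

$$\mathbf{v}_T = \frac{2\pi R_g \sqrt{1 - e^2}}{T} \left[ \pm \left( \Big( \frac{1}{1-e} - \frac{1}{D} \Big) \Big( \frac{1}{D} - \frac{1}{1+e} \Big) \right)^{1/2} \frac{\mathbf{R}_T}{D_T}
 + \frac{1}{D}\, \frac{\mathbf{n}_3 \wedge \mathbf{R}_T}{D_T} \right] \tag{Q8}$$

The sign to choose in (Q.8) is $-$ for observations between roughly the summer
solstice and the winter solstice and $+$ in the other period (as in this epoch the
perihelion is early in January, a few days after the winter solstice).

The calculation of the fixed stars aberrations need not be carried out if one
has astronomical tables containing in some form its value, for the observatory
of interest.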
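
As a rough numerical illustration of the Keplerian velocity of the Earth, the sketch below evaluates the radial and transverse components obtained from (Q.6), (Q.7), with $D = D_T/R_g$ and $R_m = R_g\sqrt{1-e^2}$, using the values of $R_g$, $T$, $e$ quoted in the data of problem 14. The function name `earth_velocity_components` and the `sign` argument are assumptions made for this example only.

```python
import math

R_G_KM = 1.496e8      # major semiaxis of the Earth orbit in km (data of problem 14)
T_S = 3.15582048e7    # revolution period of the Earth in s (data of problem 14)
ECC = 0.016729        # eccentricity of the Earth orbit (data of problem 14)

def earth_velocity_components(D, sign=+1):
    """Radial and transverse velocity (km/s) at distance D, in units of R_g.

    sign = -1 between aphelion and perihelion (distance decreasing),
    sign = +1 in the other half of the orbit.
    """
    prefactor = 2.0 * math.pi * R_G_KM * math.sqrt(1.0 - ECC**2) / T_S
    radicand = (1.0 / (1.0 - ECC) - 1.0 / D) * (1.0 / D - 1.0 / (1.0 + ECC))
    radial = sign * prefactor * math.sqrt(max(radicand, 0.0))  # guard against rounding
    transverse = prefactor / D
    return radial, transverse

for label, D in (("perihelion", 1.0 - ECC), ("mean distance", 1.0), ("aphelion", 1.0 + ECC)):
    v_r, v_t = earth_velocity_components(D)
    print(label, v_r, v_t, math.hypot(v_r, v_t))
```

At the perihelion and at the aphelion the radial component vanishes, and at $D = 1$ the total speed reduces to $2\pi R_g/T$; both properties are convenient sanity checks on an implementation. Dividing the speeds by the velocity of light gives the size of the aberration correction, of order $10^{-4}$.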


12. (time aberrations) If $\mathbf{A} + \rho\mathbf{B}$ is the heliocentric position calculated as
above as a function of the unknown distance $\rho$ between the Earth and the
heavenly body, as observed at the time $t$, one has to think that in fact
it provides us with the position really occupied by the heavenly body at the
time $t - \rho/c$, since the speed of light is finite.

Furthermore sometimes the astronomical tables give the geocentric position
of the Sun rather than the heliocentric position of the Earth: in this case
some obvious changes have to be made to the above formulae and one has to
add the further correction on the time of the observation obtained by reading
in the tables the data relative to the times $t + t_s$ if $t_s$ is the time necessary for
light to travel from the Sun to the Earth, i.e. 500 s or about 8 min. In practice
this means that one has to change the Earth longitude $\lambda_T$ into $\lambda_T + \lambda$ if $\lambda$ is
the arc described by the Earth in the time $t_s$: this would be constant if the
Earth had a circular orbit, but it varies in a way that can be deduced from
the tables around a mean value of 20.25″, with oscillations between −0.34″
(at the perihelion) and +0.34″ (at the aphelion).


13. (planetary aberrations) the Earth ecliptic plane (as one should by now
suspect) is in fact also not fixed in space, mainly because of the perturbations
caused by Jupiter's attraction, and the Earth is not exactly on the ecliptic
plane, mainly because of the Moon (in fact, it is the center of mass of the
Earth-Moon system which is really moving and defining the ecliptic plane):
hence the Sun has an apparent latitude $-\beta$: this small quantity is directly
measurable (for the main Moon contribution) or is accessible to theoretical
analysis and can be found in the tables. One can take it into account (neglecting
terms of $O(\beta^2)$) simply correcting the expression for the vector $\mathbf{A}$ by
adding to it a vector $+\beta D_T \mathbf{n}_3$.


14. (summary of the heliocentric coordinates calculations) the heliocentric position
of a heavenly body C observed on the meridian with height above the equator $\delta_E$ at a time $t$ is
the sum of two vectors $\mathbf{A}$ and $\rho\mathbf{B}$ whose Cartesian components
in a heliocentric system can be computed via the astronomical tables, which
provide the orbital data for the Earth, the aberrations, etc. In terms of the
symbols introduced in the previous problems one finds:

$$\begin{aligned}
\mathbf{A} &= D_T V_3(\lambda_T + \lambda_p + \lambda_n)\, \mathbf{n}_1 + R\, V_3(\lambda_p + \lambda_n)\, V_1(-i)\, V_3(\lambda_0)\, V_2(-\delta_0)\, \mathbf{n}_1 + \beta D_T \mathbf{n}_3 \\
\mathbf{B}_a &= V_3(\lambda_p + \lambda_n)\, V_1(-i)\, V_3(\lambda_0)\, V_2(-\delta_E)\, \mathbf{n}_1 \\
\mathbf{B} &= \mathbf{B}_a \left( 1 - \frac{\mathbf{v} \cdot \mathbf{B}_a}{c} \right) + \frac{\mathbf{v}}{c}
\end{aligned} \tag{Q9}$$

The $\rho$ coordinate is not measurable directly: it will be our main problem
to show that it can be computed from the data.

Compute $\mathbf{A}, \mathbf{B}$ from the following table providing the data of the asteroid
Juno (observed at Greenwich on October 5, 17, 27 1804; the data (taken from
the book of Gauss) are referred to the epoch E = 1 January 1805):


t              | δ_E       | λ_T          | λ_p    | i − i_0
5d 10h 51m 6s  | −6°40′8″  | 12°28′53.72″ | 11.87″ | 59.48″
17d 9h 58m 10s | −8°47′25″ | 24°20′21.54″ | 10.23″ | 59.26″
27d 9h 16m 41s | −10°2′28″ | 34°16′52.21″ | 8.86″  | 59.06″


λ_0           | D_T       | β      | λ_n
357°10′22.35″ | 0.9988899 | 0.49″  | −15.43″
355°43′45.30″ | 0.9953968 | −0.79″ | −15.41″
355°11′10.95″ | 0.9928340 | 0.15″  | −15.60″

and:

$R = 4.1683397 \times 10^{-5}$, $e = 0.016729$, $\delta_0 = 51°28′39″$,
$i_0 = 23°27′$, $c = 2.0039603 \times 10^{-3}$ au/s, $T_D = 23{\rm h}\,56{\rm m}\,4.1{\rm s}$,
$R_g = 1.496 \times 10^{8}$ km, $T = 3.15582048 \times 10^{7}$ s

where $D_T$ and $R$ are in astronomical units and $T_D$ is the period of rotation
of the Earth (use a computer to write a program producing the Cartesian
components of $\mathbf{A}$ and $\mathbf{B}$).
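
A possible starting point for such a program is sketched below in plain Python. It evaluates $\mathbf{A}$ and the apparent direction $\mathbf{B}_a$ for the first observation only, and it leaves out the aberration correction leading from $\mathbf{B}_a$ to $\mathbf{B}$. Two things are assumptions of this sketch rather than statements of the text: the reading of the first row of the table as $\delta_E = -6°40′8″$, $\lambda_T = 12°28′53.72″$, $\lambda_p = 11.87″$, $i - i_0 = 59.48″$, $\lambda_0 = 357°10′22.35″$, $D_T = 0.9988899$, $\beta = 0.49″$, $\lambda_n = -15.43″$, and the helper names `arc`, `apply`, `rotate_to_heliocentric`.

```python
import math

def arc(deg=0.0, minutes=0.0, seconds=0.0, negative=False):
    """Angle in radians from degrees, arcminutes and arcseconds."""
    value = math.radians(deg + minutes / 60.0 + seconds / 3600.0)
    return -value if negative else value

def V1(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def V2(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def V3(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(M, v):
    return tuple(sum(M[i][j] * v[j] for j in range(3)) for i in range(3))

n1, n3 = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

# First observation (5 October), values as read from the table (assumed reading).
delta_E = arc(6, 40, 8, negative=True)
lam_T = arc(12, 28, 53.72)
lam_p = arc(seconds=11.87)
lam_n = arc(seconds=15.43, negative=True)
incl = arc(23, 27, 59.48)      # i = i_0 + (i - i_0)
lam_0 = arc(357, 10, 22.35)
beta = arc(seconds=0.49)
D_T = 0.9988899                # astronomical units
R = 4.1683397e-5               # Earth radius in astronomical units
delta_0 = arc(51, 28, 39)      # latitude of the observatory

def rotate_to_heliocentric(v):
    """Apply V3(lam_p + lam_n) V1(-i) V3(lam_0) to a vector, right to left."""
    return apply(V3(lam_p + lam_n), apply(V1(-incl), apply(V3(lam_0), v)))

earth = apply(V3(lam_T + lam_p + lam_n), n1)
parallax = rotate_to_heliocentric(apply(V2(-delta_0), n1))
A = tuple(D_T * e + R * p + beta * D_T * z for e, p, z in zip(earth, parallax, n3))
B_a = rotate_to_heliocentric(apply(V2(-delta_E), n1))

print("A   =", A)
print("B_a =", B_a)
```

Since the parallax and latitude terms are tiny, the length of $\mathbf{A}$ should differ from $D_T$ by less than $10^{-4}$, and $\mathbf{B}_a$ should have unit length; the other two observations are handled by changing the input values.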

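15. (planarity condition) let $\mathbf{A}_i, \mathbf{B}_i, \mathbf{X}_i$, $i = 1,2,3$ be the vectors describing
aberration free heliocentric coordinates of a heavenly body gravitating around
the Sun according to the Keplerian laws. Then $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3$ are on the same
plane. Show that this implies:

$$(\mathbf{X}_1 \wedge \mathbf{X}_2 \cdot \mathbf{k})\, \mathbf{X}_3 + (\mathbf{X}_2 \wedge \mathbf{X}_3 \cdot \mathbf{k})\, \mathbf{X}_1 + (\mathbf{X}_3 \wedge \mathbf{X}_1 \cdot \mathbf{k})\, \mathbf{X}_2 = 0 \tag{Q10}$$

where $\mathbf{k}$ denotes the unit vector orthogonal to the plane of $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3$ (Hint:
remark that there exist $\alpha, \beta$ such that $\mathbf{X}_3 = \alpha \mathbf{X}_1 + \beta \mathbf{X}_2$ and substitute in
(Q.10)).

If one introduces the oriented areas $n_{pq}/2$ of the triangles $S_{pq}$, $p, q = 1, 2, 3$,
formed by joining S, p, q, show that (Q.10) becomes:

$$n_{12} \mathbf{X}_3 + n_{23} \mathbf{X}_1 - n_{13} \mathbf{X}_2 = 0 \tag{Q11}$$

because of the geometrical meaning of $\mathbf{X}_p \wedge \mathbf{X}_q \cdot \mathbf{k}$.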
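
The planarity relation is easy to test numerically before it is used. The sketch below takes three arbitrary vectors in a tilted plane through the origin (the tilt angle, radii and angles are made-up values for illustration), sets $n_{pq} = \mathbf{X}_p \wedge \mathbf{X}_q \cdot \mathbf{k}$, and evaluates the combination $n_{12}\mathbf{X}_3 + n_{23}\mathbf{X}_1 - n_{13}\mathbf{X}_2$, which should vanish up to rounding errors.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Orthonormal basis of a plane tilted by 0.2 rad with respect to the xy plane.
tilt = 0.2
e1 = (1.0, 0.0, 0.0)
e2 = (0.0, math.cos(tilt), math.sin(tilt))
k = cross(e1, e2)  # unit normal to the plane

def in_plane(r, theta):
    """Point of the plane at distance r and polar angle theta."""
    return tuple(r * (math.cos(theta) * a + math.sin(theta) * b) for a, b in zip(e1, e2))

X1, X2, X3 = in_plane(2.6, 0.3), in_plane(2.7, 0.4), in_plane(2.8, 0.5)

n12 = dot(cross(X1, X2), k)
n23 = dot(cross(X2, X3), k)
n13 = dot(cross(X1, X3), k)

residual = tuple(n12 * z + n23 * x - n13 * y for x, y, z in zip(X1, X2, X3))
print(residual)
```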


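16. (distance and area relations) show that (Q.11) implies:

$$\begin{aligned}
a\rho_1 &= -(\mathbf{B}_2 \wedge \mathbf{B}_3 \cdot \mathbf{A}_1) + \frac{n_{13}}{n_{23}} (\mathbf{B}_2 \wedge \mathbf{B}_3 \cdot \mathbf{A}_2) - \frac{n_{12}}{n_{23}} (\mathbf{B}_2 \wedge \mathbf{B}_3 \cdot \mathbf{A}_3) \\
a\rho_2 &= -\frac{n_{23}}{n_{13}} (\mathbf{B}_1 \wedge \mathbf{B}_3 \cdot \mathbf{A}_1) + (\mathbf{B}_1 \wedge \mathbf{B}_3 \cdot \mathbf{A}_2) - \frac{n_{12}}{n_{13}} (\mathbf{B}_1 \wedge \mathbf{B}_3 \cdot \mathbf{A}_3) \\
a\rho_3 &= -\frac{n_{23}}{n_{12}} (\mathbf{B}_1 \wedge \mathbf{B}_2 \cdot \mathbf{A}_1) + \frac{n_{13}}{n_{12}} (\mathbf{B}_1 \wedge \mathbf{B}_2 \cdot \mathbf{A}_2) - (\mathbf{B}_1 \wedge \mathbf{B}_2 \cdot \mathbf{A}_3)
\end{aligned} \tag{Q12}$$

where $a = (\mathbf{B}_1 \wedge \mathbf{B}_2) \cdot \mathbf{B}_3$.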


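17. (other distance areas relations) an alternative set of relations, which will
be useful, is found by multiplying (Q.11) vectorially by $\mathbf{B}_3$
(thus eliminating the explicit dependence on $\rho_3$) and then scalarly by $\mathbf{B}_1 \wedge \mathbf{B}_3$,
and likewise by multiplying it vectorially by $\mathbf{B}_1$ (so that in the same sense one
eliminates $\rho_1$) and then scalarly by $\mathbf{B}_1 \wedge \mathbf{B}_3$. Show that in this way one finds:

$$\begin{aligned}
(\mathbf{B}_1 \wedge \mathbf{B}_3)^2 \rho_1 = {} & -\frac{n_{12}}{n_{23}} (\mathbf{A}_3 \wedge \mathbf{B}_3) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) - (\mathbf{A}_1 \wedge \mathbf{B}_3) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) \\
& + \frac{n_{13}}{n_{23}} (\mathbf{A}_2 \wedge \mathbf{B}_3) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) + \frac{n_{13}}{n_{23}}\, \rho_2\, (\mathbf{B}_2 \wedge \mathbf{B}_3) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) \\
(\mathbf{B}_1 \wedge \mathbf{B}_3)^2 \rho_3 = {} & (\mathbf{A}_3 \wedge \mathbf{B}_1) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) + \frac{n_{23}}{n_{12}} (\mathbf{A}_1 \wedge \mathbf{B}_1) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) \\
& - \frac{n_{13}}{n_{12}} (\mathbf{A}_2 \wedge \mathbf{B}_1) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) - \frac{n_{13}}{n_{12}}\, \rho_2\, (\mathbf{B}_2 \wedge \mathbf{B}_1) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3)
\end{aligned} \tag{Q13}$$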


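18. (computation of relevant constants) with reference to the above two problems
write a program for the computation of the following constants:

$$\begin{aligned}
a &= (\mathbf{B}_1 \wedge \mathbf{B}_2) \cdot \mathbf{B}_3 & b &= (\mathbf{B}_1 \wedge \mathbf{B}_3) \cdot \mathbf{A}_2 \\
c &= -(\mathbf{B}_1 \wedge \mathbf{B}_3) \cdot \mathbf{A}_1 & d &= -(\mathbf{B}_1 \wedge \mathbf{B}_3) \cdot \mathbf{A}_3 \\
\gamma_0 &= (\mathbf{B}_1 \wedge \mathbf{B}_3)^2 \\
\gamma_1 &= -(\mathbf{A}_3 \wedge \mathbf{B}_3) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) & \gamma_2 &= -(\mathbf{A}_1 \wedge \mathbf{B}_3) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) \\
\gamma_3 &= (\mathbf{A}_2 \wedge \mathbf{B}_3) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) & \gamma_4 &= (\mathbf{B}_2 \wedge \mathbf{B}_3) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) \\
\gamma_5 &= (\mathbf{A}_3 \wedge \mathbf{B}_1) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) & \gamma_6 &= (\mathbf{A}_1 \wedge \mathbf{B}_1) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) \\
\gamma_7 &= -(\mathbf{A}_2 \wedge \mathbf{B}_1) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3) & \gamma_8 &= -(\mathbf{B}_2 \wedge \mathbf{B}_1) \cdot (\mathbf{B}_1 \wedge \mathbf{B}_3)
\end{aligned} \tag{Q14}$$

and show that the second of (Q.12) and (Q.13) become:

$$\begin{aligned}
a\rho_2 &= b + c\, \frac{n_{23}}{n_{13}} + d\, \frac{n_{12}}{n_{13}} \\
\gamma_0 \rho_1 &= \gamma_1 \frac{n_{12}}{n_{23}} + \gamma_2 + \frac{n_{13}}{n_{23}} (\gamma_3 + \rho_2 \gamma_4) \\
\gamma_0 \rho_3 &= \gamma_5 + \frac{n_{23}}{n_{12}}\, \gamma_6 + \frac{n_{13}}{n_{12}} (\gamma_7 + \rho_2 \gamma_8)
\end{aligned} \tag{Q15}$$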
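
A program of this kind can be checked on synthetic data for which everything is known exactly. The sketch below is one possible way to do it in plain Python; the function names `gauss_constants` and `q15_residuals` and the sample configuration are assumptions made for this example. It builds three coplanar positions $\mathbf{X}_i$, three "Earth" positions $\mathbf{A}_i$ on a unit circle, deduces $\rho_i = |\mathbf{X}_i - \mathbf{A}_i|$ and $\mathbf{B}_i = (\mathbf{X}_i - \mathbf{A}_i)/\rho_i$, and verifies that the three relations expressing $a\rho_2$, $\gamma_0\rho_1$, $\gamma_0\rho_3$ through the constants and the exact ratios $n_{pq}/n_{rs}$ hold up to rounding errors.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def gauss_constants(A, B):
    """Constants a, b, c, d, gamma_0..gamma_8 from the vectors A_i and unit vectors B_i."""
    (A1, A2, A3), (B1, B2, B3) = A, B
    w = cross(B1, B3)
    return {
        "a": dot(cross(B1, B2), B3),
        "b": dot(w, A2), "c": -dot(w, A1), "d": -dot(w, A3),
        "g0": dot(w, w),
        "g1": -dot(cross(A3, B3), w), "g2": -dot(cross(A1, B3), w),
        "g3": dot(cross(A2, B3), w), "g4": dot(cross(B2, B3), w),
        "g5": dot(cross(A3, B1), w), "g6": dot(cross(A1, B1), w),
        "g7": -dot(cross(A2, B1), w), "g8": -dot(cross(B2, B1), w),
    }

def q15_residuals(K, rho, n12, n23, n13):
    """Left minus right hand sides of the three relations for a*rho_2, g0*rho_1, g0*rho_3."""
    r1, r2, r3 = rho
    return (
        K["a"] * r2 - (K["b"] + K["c"] * n23 / n13 + K["d"] * n12 / n13),
        K["g0"] * r1 - (K["g1"] * n12 / n23 + K["g2"] + (n13 / n23) * (K["g3"] + r2 * K["g4"])),
        K["g0"] * r3 - (K["g5"] + (n23 / n12) * K["g6"] + (n13 / n12) * (K["g7"] + r2 * K["g8"])),
    )

# Synthetic configuration (made-up numbers): orbit plane tilted by 0.2 rad.
tilt = 0.2
e1, e2 = (1.0, 0.0, 0.0), (0.0, math.cos(tilt), math.sin(tilt))
normal = cross(e1, e2)
X = [tuple(r * (math.cos(t) * p + math.sin(t) * q) for p, q in zip(e1, e2))
     for r, t in ((2.6, 0.3), (2.7, 0.4), (2.8, 0.5))]
A = [(math.cos(t), math.sin(t), 0.0) for t in (0.1, 0.3, 0.5)]
rho = [math.sqrt(dot(tuple(x - y for x, y in zip(Xi, Ai)), tuple(x - y for x, y in zip(Xi, Ai))))
       for Xi, Ai in zip(X, A)]
B = [tuple((x - y) / r for x, y in zip(Xi, Ai)) for Xi, Ai, r in zip(X, A, rho)]

n12 = dot(cross(X[0], X[1]), normal)
n23 = dot(cross(X[1], X[2]), normal)
n13 = dot(cross(X[0], X[2]), normal)

K = gauss_constants(A, B)
print(q15_residuals(K, rho, n12, n23, n13))
```

With real observations the ratios $n_{pq}/n_{rs}$ are of course not known in advance, which is the difficulty addressed in the following problems.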


19. (orders of magnitude) suppose that the angles between $\mathbf{B}_i, \mathbf{B}_j$ are small
and so are the angles between $\mathbf{A}_i, \mathbf{A}_j$; let $\varepsilon$ be their order of magnitude. Show
that the coefficients in the preceding problem have the following orders of
magnitude in terms of $\varepsilon$:

$$\begin{aligned}
a &= O(\varepsilon^3) & b &= O(\varepsilon) & c &= O(\varepsilon) & d &= O(\varepsilon) \\
\gamma_0 &= O(\varepsilon^2) \\
\gamma_1 &= O(\varepsilon) & \gamma_2 &= O(\varepsilon) & \gamma_3 &= O(\varepsilon) & \gamma_4 &= O(\varepsilon^2) \\
\gamma_5 &= O(\varepsilon) & \gamma_6 &= O(\varepsilon) & \gamma_7 &= O(\varepsilon) & \gamma_8 &= O(\varepsilon^2)
\end{aligned} \tag{Q16}$$

(Hint: to see that $a = O(\varepsilon^3)$ note that the volume of the parallelepiped generated
by three vectors forming an angle $O(\varepsilon)$ between each other is in general
of $O(\varepsilon^2)$; however if $\mathbf{A}_1 = \mathbf{A}_2 = \mathbf{A}_3$ it would be $a = 0$, because then the
$\mathbf{B}$'s would be in the same plane. Hence the reason why the $\mathbf{B}$'s are not on
the same plane is because the $\mathbf{A}$'s are not identical; but the $\mathbf{A}$ and $\mathbf{B}$ vary
smoothly and the $\mathbf{A}$'s too form between each other angles of $O(\varepsilon)$...).

20. (necessary accuracy) if $t_i$, $i = 1,2,3$ are the observation times let $t_{pq} =
t_q - t_p$ and show that the Kepler laws imply:

$$\frac{n_{pq}}{n_{rs}} = \frac{t_{pq}}{t_{rs}} + O(\varepsilon^2) \tag{Q17}$$

if $\varepsilon$ has the meaning of the previous problem. (Hint: the second Kepler's law
gives proportionality between the signed area of the elliptic sector swept by
C and the time $t_{pq}$, and the area of the sector differs from that of the triangles
by $O(\varepsilon^2)$).

Show that if one neglects $O(\varepsilon^2)$ and $n_{pq}/n_{rs}$ is replaced by $t_{pq}/t_{rs}$ (which
is directly accessible from the measurements) in (Q.15) then one makes an
error on $\rho_2$, for instance, of $O(1)$! Hence we see that we have to find a better
way to start an approximation.


21. (how well should $\rho_2$ be known) show that if $\rho_2$ were known to $O(\varepsilon)$ then
the second and third of (Q.15) would permit us to evaluate $\rho_1, \rho_3$ also to an
error of $O(\varepsilon)$ even using the approximation in which one makes an error of
order $O(\varepsilon^2)$ in the ratios $n_{pq}/n_{rs}$ (for instance replacing them by $t_{pq}/t_{rs}$). (Hint:
this follows immediately from the estimates in 19)).

Therefore one has to look for an approximation of $\rho_2$ within $O(\varepsilon)$.


22. (Gauss' lemma) introduce the ratios $z_{pq}$ between the double of the area
of the elliptic sector swept by C between the times $t_p, t_q$ and the quantities
$n_{pq}$ introduced in 15) above. As already remarked such ratios differ from 1 by
$O(\varepsilon^2)$ and furthermore the ratios $z_{pq} n_{pq}/t_{pq}$ are constant in $p, q$.

Consider the first expression for $\rho_2$ in (Q.15) and show that if one replaces
it with

$$a\rho_2 = b + \frac{c\, t_{23} + d\, t_{12}}{t_{23} + t_{12}}\; \frac{n_{23} + n_{12}}{n_{13}} \tag{Q18}$$

one makes an error on $\rho_2$ of order $O(\varepsilon)$, rather than $O(1)$ (as one could believe
on first thought on the basis of an argument similar to the one suggested in
20), until one remarks that:

$$\frac{c\, t_{23} + d\, t_{12}}{t_{23} + t_{12}} = \frac{c\, n_{23} + d\, n_{12}}{n_{23} + n_{12}}
 - \frac{t_{12} t_{23} (c - d)(z_{12} - z_{23})}{(t_{23} + t_{12})(z_{12} t_{23} + z_{23} t_{12})} \tag{Q19}$$

and that the denominator has size $O(\varepsilon^2)$ while the numerator has size
$O(\varepsilon^4)(c - d)$ and, furthermore, that although $c, d$ have size of $O(\varepsilon)$ their
difference has size $O(\varepsilon^2)$ because $c - d = -(\mathbf{B}_1 \wedge \mathbf{B}_3) \cdot (\mathbf{A}_1 - \mathbf{A}_3)$).


22. (Gauss $\rho_2$ equation) let $\kappa = 2\pi R_g^{3/2}/T$, where $R_g$ is the major semiaxis
of any major planet orbiting the Sun and $T$ is the corresponding period:
it follows from (4.10.7) that $\kappa^2$ is the product of the Sun mass times the
universal gravitational constant. It follows from the theory of the two body
problem, as remarked by Gauss, that the following basic relation holds between
$n_{pq}$, $z_{pq}$, $r_q = |\mathbf{X}_q|$ and the angles $2f_{pq}$ at S of the triangles $S_{pq}$ with vertices in
the Sun and the C positions at the times of the corresponding observations:

$$\frac{n_{23} + n_{12}}{n_{13}} = 1 + \frac{\kappa t_{12}\, \kappa t_{23}}{2 z_{12} z_{23}\, r_1 r_2 r_3 \cos f_{12} \cos f_{23} \cos f_{13}} \tag{Q20}$$

A guide to the derivation of the above relation is provided in the following
problem 23). Approximating $z$ and the cosines with 1 and identifying $r_1, r_2, r_3$
show that one finds:

$$a\rho_2 = b + \frac{c\, t_{23} + d\, t_{12}}{t_{23} + t_{12}} \left( 1 + \frac{\kappa t_{12}\, \kappa t_{23}}{2 r_2^3} \right) \tag{Q21}$$

which is an equation for the unknown $\rho_2$, because $t_{pq}$ are known and $r_2 =
|\mathbf{A}_2 + \rho_2 \mathbf{B}_2| = (\mathbf{A}_2^2 + \rho_2^2 + 2\rho_2 \mathbf{A}_2 \cdot \mathbf{B}_2)^{1/2}$. We neglect here the time aberrations
which will be corrected only at the end.

Usually (Q.21) admits only one acceptable solution (show, however, that
it can be rationalized and becomes an equation of eighth degree). Show that it
determines $\rho_2$ to $O(\varepsilon)$ (Hint: the $z_{pq}$ and the cosines differ from 1 by $O(\varepsilon^2)$
and the $r_q$ differ between each other by $O(\varepsilon)$; furthermore $c, d, \kappa t_{pq}$ have size
of $O(\varepsilon)$ and $a = O(\varepsilon^3)$...). Write a computer program to solve the equation
(Q.21) and find its positive solutions.
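
One simple way to write such a program is to scan the positive axis for sign changes and refine each one by bisection. The sketch below does this for the equation $a\rho = b + m\,(1 + Q/(2 r_2^3))$ with $r_2 = |\mathbf{A}_2 + \rho \mathbf{B}_2|$, where $m = (c\,t_{23} + d\,t_{12})/(t_{23} + t_{12})$ and $Q = \kappa t_{12}\,\kappa t_{23}$. The function name `gauss_rho2_roots`, the search range `rho_max` and the grid size `samples` are assumptions of this example; bisection is chosen for robustness and is not a method prescribed by the text.

```python
import math

def gauss_rho2_roots(a, b, m, Q, A2, B2, rho_max=10.0, samples=2000):
    """Positive roots of a*rho = b + m*(1 + Q/(2*r2**3)), r2 = |A2 + rho*B2|.

    The interval (0, rho_max] is scanned on a uniform grid and every sign
    change of the residual is refined by bisection.
    """
    def r2(rho):
        return math.sqrt(sum((x + rho * y) ** 2 for x, y in zip(A2, B2)))

    def residual(rho):
        return a * rho - b - m * (1.0 + Q / (2.0 * r2(rho) ** 3))

    roots = []
    step = rho_max / samples
    lo, f_lo = step, residual(step)
    for i in range(2, samples + 1):
        hi = i * step
        f_hi = residual(hi)
        if f_lo == 0.0:
            roots.append(lo)
        elif f_lo * f_hi < 0.0:
            x0, x1, f0 = lo, hi, f_lo
            for _ in range(200):            # bisection down to rounding accuracy
                mid = 0.5 * (x0 + x1)
                f_mid = residual(mid)
                if f0 * f_mid <= 0.0:
                    x1 = mid
                else:
                    x0, f0 = mid, f_mid
            roots.append(0.5 * (x0 + x1))
        lo, f_lo = hi, f_hi
    return roots

# Made-up test data: b is chosen so that rho = 2.0123 is a solution.
A2, B2 = (1.0, 0.0, 0.0), (0.6, 0.8, 0.0)
a, m, Q, rho_true = 1.0e-3, -2.0e-3, 1.0e-2, 2.0123
r_true = math.sqrt(sum((x + rho_true * y) ** 2 for x, y in zip(A2, B2)))
b = a * rho_true - m * (1.0 + Q / (2.0 * r_true ** 3))
print(gauss_rho2_roots(a, b, m, Q, A2, B2))
```

If several positive roots are found, the acceptable one has to be selected on physical grounds; a root-polishing step (for instance Newton's method) can be added if more digits are needed.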


23. (digression on the two body problem to prove (Q.20)) rewrite Eq. (4.10.6)
in the form:

$$\frac{p}{r} = 1 - e \cos\theta \tag{Q22}$$

where $p$ is the parameter of the ellipse and $e$ is its eccentricity. Then deduce
that:

$$\begin{aligned}
p &= \frac{2\rho_+ \rho_-}{\rho_+ + \rho_-}, & e &= \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-}, & a &= \frac{\rho_+ + \rho_-}{2}, \\
\rho_\pm &= a(1 \pm e), & b &= a\sqrt{1 - e^2}, & p &= \frac{b^2}{a}
\end{aligned} \tag{Q23}$$

where we are denoting: $a$ = major semiaxis of the ellipse, $b$ = minor semiaxis.

If we call $\theta_1, \theta_2, \theta_3$ the three angles that $\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3$ form with respect to
a given line drawn on their plane (e.g. with respect to the ascending node with
the ecliptic plane, oriented parallel to $\mathbf{n}_3 \wedge (\mathbf{X}_1 \wedge \mathbf{X}_3)$), and if $g$ denotes the
angle between the same reference line and the major semiaxis of the elliptic
orbit of the heavenly body, then the true anomalies of the three positions will
be $\beta_1 = \theta_1 - g$, $\beta_2 = \theta_2 - g$, $\beta_3 = \theta_3 - g$ and it will be:

$$\begin{aligned}
p\, r_1^{-1} &= 1 - e \cos\beta_1 \\
p\, r_2^{-1} &= 1 - e \cos\beta_2 \\
p\, r_3^{-1} &= 1 - e \cos\beta_3
\end{aligned} \tag{Q24}$$

Note that with the notations of 22) it is
$\beta_3 - \beta_2 = \theta_3 - \theta_2 = 2f_{23}$, $\beta_3 - \beta_1 = \theta_3 - \theta_1 = 2f_{13}$, $\beta_2 - \beta_1 = \theta_2 - \theta_1 = 2f_{12}$
and check that:

$$\frac{z_{rs}\, n_{rs}}{\kappa\, t_{rs}} = \sqrt{p} \tag{Q25}$$

furthermore, since $n_{pq} = r_p r_q \sin 2f_{pq}$, it is:

$$\begin{aligned}
4 \sin f_{23} \sin f_{13} \sin f_{12} &= \sin 2f_{23} + \sin 2f_{12} - \sin 2f_{13} \\
&= \frac{p}{r_1} \sin 2f_{23} + \frac{p}{r_3} \sin 2f_{12} - \frac{p}{r_2} \sin 2f_{13} \equiv p\, \frac{n_{23} + n_{12} - n_{13}}{r_1 r_2 r_3}
\end{aligned} \tag{Q26}$$

and deduce from (Q.26),(Q.25) that (Q.20) holds (Hint: check first (Q.25)
by remarking that, by the second Kepler law, the ratio between the area of the
elliptic sector $S_{rs}$ and the area of the ellipse, i.e. $z_{rs} n_{rs}/(2\pi a b)$, coincides with
the ratio between the time needed to sweep the sector and the period of the heavenly
body, i.e. $t_{rs}/(2\pi a^{3/2}/\kappa)$ by (4.10.7); this gives (Q.25), while (Q.24) immediately
implies (Q.26)). Then to get (Q.20) one combines (Q.25),(Q.26), getting:

$$\frac{z_{23} z_{12}\, n_{23} n_{12}}{\kappa t_{23}\, \kappa t_{12}}
 = \frac{4 \sin f_{23} \sin f_{13} \sin f_{12}\; r_1 r_2 r_3}{n_{23} + n_{12} - n_{13}}
 = \frac{\sin 2f_{23} \sin 2f_{12} \sin 2f_{13}\; r_1 r_2 r_3}{2 (n_{23} + n_{12} - n_{13}) \cos f_{23} \cos f_{12} \cos f_{13}} \tag{Q27}$$

i.e.:

$$n_{23} + n_{12} - n_{13}
 = \frac{\kappa t_{23}\, \kappa t_{12} \sin 2f_{23} \sin 2f_{12} \sin 2f_{13}\; r_1^2 r_2^2 r_3^2}{2 z_{23} z_{12}\, n_{23} n_{12} \cos f_{23} \cos f_{12} \cos f_{13}\; r_1 r_2 r_3}
 = \frac{\kappa t_{23}\, \kappa t_{12}\, n_{13}}{2 z_{12} z_{23} \cos f_{23} \cos f_{12} \cos f_{13}\; r_1 r_2 r_3} \tag{Q28}$$

because $n_{pq} = r_p r_q \sin 2f_{pq}$.
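
The purely geometric part of this derivation can be verified numerically. The sketch below picks an ellipse $p/r = 1 - e\cos\beta$ and three true anomalies (made-up values), builds $r_i$ and $n_{pq} = r_p r_q \sin(\beta_q - \beta_p)$, and compares $p\,(n_{23} + n_{12} - n_{13})$ with $4 \sin f_{23} \sin f_{13} \sin f_{12}\; r_1 r_2 r_3$, where $2f_{pq} = \beta_q - \beta_p$. The function name `triangle_identity` is an assumption of this example.

```python
import math

def triangle_identity(p, e, betas):
    """Return both sides of p*(n23 + n12 - n13) = 4 sin f23 sin f13 sin f12 r1 r2 r3."""
    b1, b2, b3 = betas
    r1, r2, r3 = (p / (1.0 - e * math.cos(b)) for b in betas)
    f12, f23, f13 = (b2 - b1) / 2.0, (b3 - b2) / 2.0, (b3 - b1) / 2.0
    n12 = r1 * r2 * math.sin(2.0 * f12)
    n23 = r2 * r3 * math.sin(2.0 * f23)
    n13 = r1 * r3 * math.sin(2.0 * f13)
    lhs = p * (n23 + n12 - n13)
    rhs = 4.0 * math.sin(f23) * math.sin(f13) * math.sin(f12) * r1 * r2 * r3
    return lhs, rhs

lhs, rhs = triangle_identity(p=2.5, e=0.25, betas=(0.3, 0.5, 0.8))
print(lhs, rhs)
```

The two numbers agree up to rounding errors for any choice of $p$, $0 \le e < 1$ and of the three anomalies, because the terms proportional to $e$ cancel identically.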


24. (summary of above) the preceding problems permit us to compute a first 
approximation o9 to the distances o; up to errors of order O(c), if € is an 
estimate of the size of the angles between the vectors A; or the vectors B;. 
Hence we have a first approximation X? for the vectors X;. It is useful to 
summarize the above procedure as follows. 

The value of o} is found by solving the equation: 


25 


3 
2r3 


to3 + dt 
a02 stk i (Q29) 


t23 + tre 


where $Q = \kappa t_{23}\,\kappa t_{12}$, determining $\rho_2$ to $O(\varepsilon)$ (see problem 22). Set also
$P = t_{13}/t_{23}$ and:

$$W = \Big(1 + \frac{Q}{2r_2^3}\Big)P - 1 \tag{Q.30}$$

then one realizes that $P$, $W$ are approximations to $n_{13}/n_{23}$ and $n_{12}/n_{23}$ to
order $O(\varepsilon^2)$. In fact this has been seen in (Q.17) for $P$; and $W$ differs from
$n_{12}/n_{23}$ because (see (Q.20)) $(1 + Q/2r_2^3)$ is not $(n_{12}+n_{23})/n_{13} = 1 +
Q/(2r_1r_2r_3z_{12}z_{23}\cos f_{12}\cos f_{13}\cos f_{23})$, nor is $1/P$ equal to $n_{23}/n_{13}$; but the latter two
quantities differ from the preceding ones respectively by $O(\varepsilon)Q$ and $O(\varepsilon^2)$ (see
the hint to problem 22) and the equation (Q.17)), and $Q$ is of $O(\varepsilon^2)$.

Hence by problem 20) it is possible to find $\rho_1$, $\rho_3$ within $O(\varepsilon)$ by using the
last two relations in (Q.15):


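$$
\begin{aligned}
\gamma_0\rho_1 &= \gamma_1 W + \gamma_2 + (\gamma_3 + \gamma_4\rho_2)P\\
\gamma_0\rho_3 &= \gamma_5 + \gamma_6 W^{-1} + (\gamma_7 + \gamma_8\rho_2)PW^{-1}
\end{aligned}
\tag{Q.31}
$$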


25. (elliptic elements) at this point we can compute five parameters ($i^0$, $\lambda^0$,
$g^0$, $e^0$, $p^0$) needed to determine the Keplerian orbit of the heavenly body using
the information that it is an ellipse with focus in the Sun $S$ passing through the
three points determined by the vectors $X_i^0$: this will be the first approximation
to the elements of the celestial body. We omit the superscript 0 in what follows
to simplify the notations. The five elements are:


$i$ inclination of the orbit plane over the ecliptic

$\lambda$ longitude of the ascending node between the orbit and the ecliptic

$g$ angle between the orbit major axis oriented towards the aphelion and
the ascending node

$e$ orbit eccentricity

$p$ ellipse parameter


Denoting $m$ the versor of $X_1^0\wedge X_3^0$ and with $m'$ that of $n_3\wedge m$ it is clear that
$m$ is normal to the orbit plane (by construction the $X_i^0$ verify approximately
the (Q.11), hence they are almost on the same plane)
while $m'$ is the ascending node between the orbit plane and the ecliptic.

Let $\theta_1,\theta_2,\theta_3$ be the angles formed, respectively, by $X_1^0, X_2^0, X_3^0$ with the
ascending node $m'$: they are the angular polar coordinates of the three
approximate positions in orbit, measured on the orbit plane with respect to the
ascending node. Let $r_i = |X_i|$ and check the following relations:


$$\cos i = n_3\cdot m, \qquad \cos\lambda = n_1\cdot m', \qquad \sin\lambda = n_2\cdot m' \tag{Q.32}$$
and:

$$
\begin{aligned}
\tan g &= -\,\frac{r_1^{-1}(\cos\theta_2-\cos\theta_3) + r_2^{-1}(\cos\theta_3-\cos\theta_1) + r_3^{-1}(\cos\theta_1-\cos\theta_2)}{r_1^{-1}(\sin\theta_2-\sin\theta_3) + r_2^{-1}(\sin\theta_3-\sin\theta_1) + r_3^{-1}(\sin\theta_1-\sin\theta_2)}\\
p^{-1} &= \frac{r_1^{-1}\cos(\theta_3-g) - r_3^{-1}\cos(\theta_1-g)}{\cos(\theta_3-g)-\cos(\theta_1-g)}\\
e &= \frac{r_1^{-1} - r_3^{-1}}{r_1^{-1}\cos(\theta_3-g) - r_3^{-1}\cos(\theta_1-g)}
\end{aligned}
\tag{Q.33}
$$

where the ambiguity on $g$, defined up to $\pi$, is to be solved by imposing
that the eccentricity $e$ be positive; alternatively one can express $p$ via (Q.26),
etc. (Hint: use the (Q.24) in the form:

$$
\begin{aligned}
r_1^{-1} &= p^{-1} - e\,p^{-1}\cos(\theta_1-g)\\
r_2^{-1} &= p^{-1} - e\,p^{-1}\cos(\theta_2-g)\\
r_3^{-1} &= p^{-1} - e\,p^{-1}\cos(\theta_3-g)
\end{aligned}
\tag{Q.34}
$$


The tangent of $g$ is found by multiplying the (Q.34) respectively by $(\cos\theta_2 -
\cos\theta_3)$, $(\cos\theta_3-\cos\theta_1)$ and $(\cos\theta_1-\cos\theta_2)$ and adding the resulting equations
side by side: the term with $p^{-1}$ disappears and, developing the $\cos(\theta_i-g)$ via
the addition formulae, the terms with $\cos g$ also cancel. Repeat the scheme
by multiplying by $(\sin\theta_2-\sin\theta_3)$, etc.: this time the terms with $p^{-1}$ and $\sin g$
disappear; dividing the two relations thus obtained one finds the first of the
(Q.33). Once $g$ is known one finds $p^{-1}$ and $ep^{-1}$ from the first and third
of the (Q.24), for instance, and one gets the last two of (Q.33). One could
find other essentially equivalent expressions: for instance $p$ can be determined
also via the (Q.26) (they would be really identical if there had been no
approximations)).

Write a computer program for the calculation of the five elements defined 
above. 
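
One possible shape for such a program is sketched below. It is not the author's code (which is said to be on his web site) but a minimal illustration built on (Q.32) and (Q.33). It assumes that the three vectors are given in a frame whose third axis $n_3$ is normal to the ecliptic, and that the normal to the orbit is taken as the versor of $X_1\wedge X_3$; the helper `point_on_orbit` and the numerical values are hypothetical and serve only to generate test data.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def elliptic_elements(X1, X2, X3):
    """Return (i, lam, g, e, p) from three heliocentric positions, via (Q.32)-(Q.33)."""
    n3 = (0.0, 0.0, 1.0)               # assumed normal to the ecliptic
    m = unit(cross(X1, X3))            # normal to the orbit plane
    mp = unit(cross(n3, m))            # ascending node m' (undefined if i = 0)
    q = cross(m, mp)                   # in-plane versor, 90 degrees ahead of the node
    inc = math.acos(max(-1.0, min(1.0, dot(n3, m))))
    lam = math.atan2(mp[1], mp[0])
    pts = (X1, X2, X3)
    r = [math.sqrt(dot(X, X)) for X in pts]
    th = [math.atan2(dot(X, q), dot(X, mp)) for X in pts]
    c = [math.cos(t) for t in th]
    s = [math.sin(t) for t in th]
    num = (c[1] - c[2]) / r[0] + (c[2] - c[0]) / r[1] + (c[0] - c[1]) / r[2]
    den = (s[1] - s[2]) / r[0] + (s[2] - s[0]) / r[1] + (s[0] - s[1]) / r[2]
    g = math.atan2(-num, den)          # tan g = -num/den, defined up to pi
    c1, c3 = math.cos(th[0] - g), math.cos(th[2] - g)
    d = c3 / r[0] - c1 / r[2]          # vanishes if points 1, 3 are symmetric about the axis
    e = (1 / r[0] - 1 / r[2]) / d
    p = (c3 - c1) / d
    if e < 0:                          # resolve the ambiguity on g by imposing e > 0
        g, e = g + math.pi, -e
    two_pi = 2 * math.pi
    return inc, lam % two_pi, g % two_pi, e, p

def point_on_orbit(inc, lam, g, e, p, theta):
    """Hypothetical test helper: position at polar angle theta from the node."""
    r = p / (1 - e * math.cos(theta - g))
    node = (math.cos(lam), math.sin(lam), 0.0)
    perp = (-math.cos(inc) * math.sin(lam), math.cos(inc) * math.cos(lam), math.sin(inc))
    return tuple(r * (math.cos(theta) * node[k] + math.sin(theta) * perp[k]) for k in range(3))

true_elements = (0.4, 1.1, 0.7, 0.3, 1.0)   # (i, lam, g, e, p), arbitrary test values
X = [point_on_orbit(*true_elements, th) for th in (0.2, 0.9, 1.7)]
found = elliptic_elements(*X)
print(found)
```

On exact data the five elements are recovered up to rounding; with the approximate vectors $X_i^0$ the result is only the first approximation discussed in the text.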


26. (consistency problems) express in terms of $X_i^0$, $i = 1,2,3$, the value that
the ratios $z^0_{pq}$ between the areas of the elliptic sectors $S_{pq}$ and the
corresponding triangles take in the ellipse constructed in problem 24), assuming that
the celestial body moves on it according to the Kepler laws and following the
hints given below.

Let $a, b, p$ be the major and minor semiaxes of the ellipse and the parameter; if $r_q, \theta_q$
are defined as in problem 24), introduce the quantities $\beta_q, \xi_q, l_q$ as follows:


$$\beta_q = \theta_q - g, \qquad r_q = p\,(1 - e\cos\beta_q)^{-1} = a(1 + e\cos\xi_q), \qquad l_q = \xi_q + e\sin\xi_q \tag{Q.35}$$


The above quantities are called true anomaly (it is the polar angle of $X_q^0$
with respect to the major semiaxis), eccentric anomaly and mean anomaly of
$X_q^0$. Check that:


$$z^0_{pq} = \frac{ab\,(l_q - l_p)}{r_pr_q\sin(\theta_q-\theta_p)} \tag{Q.36}$$
(Hint: the average anomaly $l$ is independently defined as the product of $2\pi/T$,
T being the orbital period of the celestial body, times the time elapsed since 
the celestial body passed its aphelion: this notion, naturally arising in the 
theory of the central motions was defined in (4.9.31), where it was denoted 
pı but setting the origin at the perihelion (hence the two definitions differ by 
$\pi$). On the basis of this definition one has, therefore:

$$\frac{dl}{dt} = \frac{2\pi}{T}, \qquad l = 0 \ \ \text{if}\ \ \beta = 0 \tag{Q.37}$$

The (Q.36) is an immediate consequence of this property of the average
anomaly which makes it proportional to the time elapsed since the passage
through the aphelion. In the Keplerian motion the latter time is proportional
to the area of the elliptic sector swept by the celestial body, hence the area
swept between $t_p$ and $t_q$ is to the area of the ellipse as the variation of the
average anomaly is to $2\pi$: i.e. the area swept is $\pi ab(l_q - l_p)/2\pi$. Since,
obviously, the area of the triangle corresponding to the elliptic sector $S_{pq}$ is
$r_pr_q\sin(\theta_q-\theta_p)/2$ the (Q.36) follows.

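The true problem is therefore to check the (Q.35), once the average
anomaly is defined via (Q.37). Recalling (4.10.11),(4.10.12), one finds, using
(Q.37):

$$\frac{dl}{dt} = \frac{2\pi}{T}, \qquad \frac{d\beta}{dt} = \frac{A}{r^2}, \qquad \frac{dr}{dt} = \pm A\sqrt{(\rho_-^{-1} - r^{-1})(r^{-1} - \rho_+^{-1})} \tag{Q.38}$$


where $\rho_-, \rho_+$ denote the distances of the perihelion and of the aphelion.
From the (4.10.16), and (4.10.18) one deduces the following relations
between the area constant $A$, the period $T$, etc.:


$$
\begin{aligned}
&\tfrac12 A = \frac{\pi ab}{T}, \qquad a = \frac{\rho_+ + \rho_-}{2}, \qquad b = \sqrt{\rho_+\rho_-}\\
&e = \frac{\rho_+ - \rho_-}{\rho_+ + \rho_-}, \qquad \rho_\pm = a(1\pm e), \qquad p = \frac{b^2}{a} = a(1-e^2)
\end{aligned}
\tag{Q.39}
$$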

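most of which have already been remarked in (Q.23). Hence the (Q.38) can
be recast in the form:


$$
\begin{aligned}
\frac{dl}{dt} &= \frac{2\pi}{T}, \qquad \frac{d\beta}{dt} = \frac{2\pi}{T}\,\frac{ab}{r^2}\\
\frac{dr}{dt} &= \pm\frac{2\pi ab}{T}\sqrt{\frac{(\rho_+ - r)(r - \rho_-)}{\rho_+\rho_-\,r^2}} = \pm\frac{2\pi a}{T}\,\frac{\sqrt{a^2e^2 - (r-a)^2}}{r}
\end{aligned}
\tag{Q.40}
$$


which imply, by conveniently dividing the above relations by each other:


$$
\begin{aligned}
\frac{dl}{d\beta} &= \frac{r^2}{ab} = \frac{p^2}{ab\,(1 - e\cos\beta)^2} = \frac{(1-e^2)^{3/2}}{(1 - e\cos\beta)^2}\\
\frac{dl}{dr} &= \pm\frac{r}{a\sqrt{a^2e^2 - (r-a)^2}}
\end{aligned}
\tag{Q.41}
$$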


It follows from the definition of the eccentric anomaly that $r = a(1 + e\cos\xi)$,
and $dr = -ae\sin\xi\, d\xi$, so that:

$$\frac{dl}{d\xi} = \pm\frac{r\,ae\sin\xi}{a\sqrt{a^2e^2 - (r-a)^2}} = 1 + e\cos\xi, \qquad l = \xi + e\sin\xi \tag{Q.42}$$

and the final choice of the + sign is based on the remark that the average
anomaly, the eccentric anomaly and the true anomaly are simultaneously
increasing as one of them increases.

The (Q.35), (Q.36) are therefore proved, and we have also found a remarkable
formula expressing in a Keplerian motion the mean anomaly in terms of the
true anomaly: the first of (Q.41) gives in fact:


$$l = (1-e^2)^{3/2}\int_0^\beta \frac{d\beta'}{(1 - e\cos\beta')^2} \tag{Q.43}$$
which, however, will not be used directly here.
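
The relations (Q.35), (Q.36) translate into a few lines of code. The sketch below is only illustrative: the function names are hypothetical, the angles are measured from the aphelion as in the text, and the eccentric anomaly is obtained from $\cos\xi = (\cos\beta - e)/(1 - e\cos\beta)$, which is equivalent to $p(1-e\cos\beta)^{-1} = a(1+e\cos\xi)$. It is assumed that position $q$ follows position $p$ by less than one revolution.

```python
import math

def anomalies(theta, g, e):
    """True, eccentric and mean anomaly (beta, xi, l), all counted from the aphelion."""
    beta = theta - g
    # xi has the same sign as beta and vanishes with it
    xi = math.atan2(math.sqrt(1 - e * e) * math.sin(beta), math.cos(beta) - e)
    l = xi + e * math.sin(xi)          # second relation in (Q.42)
    return beta, xi, l

def sector_triangle_ratio(theta_p, theta_q, g, e, p):
    """Ratio z_pq between elliptic sector and triangle, as in (Q.36)."""
    a = p / (1 - e * e)
    b = a * math.sqrt(1 - e * e)
    r_p = p / (1 - e * math.cos(theta_p - g))
    r_q = p / (1 - e * math.cos(theta_q - g))
    l_p = anomalies(theta_p, g, e)[2]
    l_q = anomalies(theta_q, g, e)[2]
    dl = (l_q - l_p) % (2 * math.pi)   # assumes q follows p within one period
    return a * b * dl / (r_p * r_q * math.sin(theta_q - theta_p))

# Arbitrary illustrative values
z = sector_triangle_ratio(0.2, 0.9, g=0.7, e=0.3, p=1.0)
print(z)
```

For a circular orbit ($e = 0$) the ratio reduces to $\Delta/\sin\Delta$, with $\Delta$ the angle between the two positions, which gives a quick sanity check.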


27. (Gauss' transformation) Let $F$ be a map transforming a pair $(\hat P, \hat Q)$ of
numbers into $(P', Q')$ defined as follows.
Given $(\hat P, \hat Q)$ consider the operations:


(i) solution of the equation for $\rho_2$:


A 


c—d 
= — 44 
ao = b+ — PUE (Q44) 
(ii) calculation of $\hat W$ via (Q.30), with $(\hat P, \hat Q, \hat W)$ replacing $(P, Q, W)$.
(iii) calculation of $\rho_1, \rho_3$ via (Q.31), with $(\hat P, \hat Q, \hat W)$ replacing $(P, Q, W)$.
(iv) calculation of the elements via (Q.32),(Q.33).
(v) calculation of the parameters $2f_{pq} = \theta_q - \theta_p$ and $z_{pq}$ via (Q.35),(Q.36).
(vi) calculation of $P', Q'$ via:


$$P' = \frac{z_{23}\,t_{13}}{z_{13}\,t_{23}}, \qquad Q' = \frac{\kappa t_{12}\,\kappa t_{23}\,r_2^2}{r_1r_3\,z_{12}z_{23}\cos f_{12}\cos f_{13}\cos f_{23}} \tag{Q.45}$$
Check, on the basis of the problems 22),23),26), that the analysis
developed there can be interpreted as proving that if one sets:

$$P = \frac{n_{13}}{n_{23}}, \qquad Q = \Big(\frac{n_{23}+n_{12}}{n_{13}} - 1\Big)\,2r_2^3 \tag{Q.46}$$


where now $n_{pq}$ and $r_2$ are the true unknown values of the areas of the triangles
$S_{pq}$ and of $|X_2|$, one has:


$$(P, Q) = F(P, Q) \tag{Q.47}$$


at least if one neglects the time aberration, i.e. if one assumes that the time
$t_q - t_p$ measured between the observations $p$ and $q$ is the true value of the time
interval between the times in which the celestial body occupies the positions
$p$ and $q$, i.e. it can be confused with $t_q - t_p - (\rho_q - \rho_p)/c$ (see problem 11)).
(Hint: check that (Q.44) becomes the first of (Q.15) if $(P, Q)$ are as in (Q.46)).

Write a computer program realizing the map $F$ defined above and apply it
to the computation of $(P', Q')$ in the case of the asteroid Juno using the data
given above.


28. (Gauss' algorithm) the preceding problem shows that one has to solve
(Q.47) as an equation on $(P, Q)$.
We have seen that $(P, Q)$ is a good first approximation. It is therefore
possible to improve it by some standard methods: the simplest is the iteration,
another possibility is Newton's method. Both were used by Gauss in his book.
And both methods have the drawback that one does not know a priori if they
will work, nor can one easily foretell (if at all possible) an estimate of the time
necessary to reach a given precision. Very often they are used empirically and
work if one has a good approximate solution as a starting point. The methods
may otherwise prove inconclusive or lead to absurd results.

We limit ourselves here to the discussion of the naive iteration method. 

Let $P = t_{13}/t_{23}$ and $Q = \kappa t_{12}\,\kappa t_{23}$, see problem 24), and define:


$$(P_0, Q_0) = (P, Q), \qquad (P_k, Q_k) = F(P_{k-1}, Q_{k-1}), \quad k = 1, 2, \ldots \tag{Q.48}$$


and it is clear that if this makes sense for all $k$, i.e. if $(P_{k-1}, Q_{k-1})$ is always in
the domain of definition of $F$, then the limit of $(P_k, Q_k)$ as $k \to \infty$ will be, if
existing, one solution of the equation (Q.47) and the corresponding data will give the
ellipse elements and orbital parameters. Note that the domain of definition of
$F$ has not been explicitly defined so far and consists of the set of pairs $(P, Q)$
for which the calculations necessary to evaluate $F$ make sense, i.e. lead to the
construction of an ellipse: recall that given three points and a focus there may
be no ellipse passing through them; the whole theory can be easily adapted
to the case of hyperbolic or parabolic orbits.

In practice one can proceed by starting the iteration from any point
$(P_0, Q_0)$. However if this initial point is not close enough to the solution it
may happen that $(P_k, Q_k)$ wanders out of the definition domain or has some
strange asymptotic motion: an undesirable event for our purposes.
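
The naive iteration (Q.48) can be organized as in the following sketch, which is independent of the details of $F$. The function `iterate_map`, its stopping rule and the map `toy_map` are hypothetical illustrations introduced here: `toy_map` is a simple contraction standing in for the true Gauss map, which would have to be built from steps (i)-(vi) of problem 27. Leaving the domain of definition is modelled by an arithmetic exception raised inside $F$.

```python
import math

def iterate_map(F, P0, Q0, tol=1e-12, max_iter=500):
    """Iterate (P_k, Q_k) = F(P_{k-1}, Q_{k-1}) until two successive iterates agree.

    Returns (P, Q, k). Raises ArithmeticError if an iterate leaves the domain
    of F or if no convergence is observed within max_iter steps.
    """
    P, Q = P0, Q0
    for k in range(1, max_iter + 1):
        try:
            P_new, Q_new = F(P, Q)
        except (ValueError, ZeroDivisionError, OverflowError) as exc:
            raise ArithmeticError(f"iterate {k - 1} is outside the domain of F: {exc}") from exc
        if abs(P_new - P) + abs(Q_new - Q) < tol:
            return P_new, Q_new, k
        P, Q = P_new, Q_new
    raise ArithmeticError("no convergence within max_iter iterations")

def toy_map(P, Q):
    """Stand-in for F: a contraction of the plane, NOT the Gauss map."""
    return 1.0 + 0.3 * math.cos(Q), 0.2 * math.sqrt(P)

P_star, Q_star, steps = iterate_map(toy_map, 1.0, 0.0)
print(P_star, Q_star, steps)
```

A convergence test based on successive iterates does not by itself prove that a fixed point has been reached, and it says nothing about uniqueness; both points are discussed in the text.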

The basic difficulty solved by Gauss was to find a method for determining 
in a rather simple way a first approximation when one knows basically nothing 
about the asteroid distance; he also devised the above algorithm based on the 
iteration of a 2-dimensional map, which is remarkably efficient. He showed the 
power of his method by computing the orbit of the first known asteroid Ceres. 

A warning: sometimes the above algorithm may lead to more than one
solution as it may be that, even if the original determination of the first
approximation for $\rho_2$ has a unique acceptable solution, the (Q.47) has more than
one fixed point. This could provoke also the unpleasant result that
modifications of the algorithm may lead to different final results. Unfortunately it
is not easy to develop a general theory of the equation (Q.47) and possible
ambiguities have to be solved on an empirical basis.

Use the above scheme to find the elements of the orbit of Juno, on the 
basis of the data in problem 13). 


29. (correction of time aberrations) The correction of the time aberrations
(problem 11)) can be performed by a small modification of the above
iterative method, very easy to implement numerically. Define, if $(P, Q)$ are as in
problem 28):

$$
\begin{aligned}
(P_0, Q_0) &= (P, Q)\\
(P_1, Q_1) &= F(P_0, Q_0)\\
(P_{k+1}, Q_{k+1}) &= F_c(P_k, Q_k), \qquad k = 1, 2, \ldots
\end{aligned}
\tag{Q.49}
$$

where $F_c$ is obtained from $F$ by replacing $t_{pq}$ in (Q.29) and (Q.45) with:
$t'_{pq} = t_q - t_p - (\rho_q - \rho_p)/c$. Check that this leads to the aberration correction.
It is simple but it no longer allows us to think of the above procedure as an
elegant map iteration problem. One could still interpret it as the iteration of
a map at the price of increasing the dimension of the space on which the map
acts.
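
The enlarged map can be sketched as follows. Everything here is an illustrative assumption: times are taken in days and distances in astronomical units, so that the speed of light is about 173.14 AU/day; `F_with_times` is a hypothetical user-supplied routine performing steps (i)-(vi) of problem 27 with given time intervals and returning the new distances together with $(P', Q')$; `toy_F` is a trivial stand-in used only to show the calling convention.

```python
C_AU_PER_DAY = 173.1446  # speed of light in AU/day (rounded; unit choice is an assumption)

def corrected_intervals(t_obs, rho, c=C_AU_PER_DAY):
    """Intervals t'_pq = t_q - t_p - (rho_q - rho_p)/c for pq = 12, 23, 13."""
    def dt(p, q):
        return (t_obs[q] - t_obs[p]) - (rho[q] - rho[p]) / c
    return {"12": dt(0, 1), "23": dt(1, 2), "13": dt(0, 2)}

def aberration_step(F_with_times, state, t_obs, c=C_AU_PER_DAY):
    """One step of the enlarged map acting on the state (P, Q, rho).

    The distances rho = (rho_1, rho_2, rho_3) found at the previous step are
    carried along in the state, which is what increases the dimension.
    """
    P, Q, rho = state
    times = corrected_intervals(t_obs, rho, c)
    return F_with_times(P, Q, times)

def toy_F(P, Q, times):
    """Stand-in for the corrected Gauss map: NOT the real computation."""
    rho = (2.0, 2.1, 2.3)                      # pretend distances, in AU
    return times["13"] / times["23"], times["12"] * times["23"], rho

t_obs = (0.0, 10.0, 25.0)                      # hypothetical observation times, in days
state = (25.0 / 15.0, 150.0, (2.0, 2.0, 2.0))  # equal distances: no correction at first
state = aberration_step(toy_F, state, t_obs)
state = aberration_step(toy_F, state, t_obs)
print(state)
```

The corrected intervals remain additive, $t'_{13} = t'_{12} + t'_{23}$, so the relation between $P$ and the time ratios used in problem 24 keeps its form.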


Apply the above correction to the elements of Juno. 
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6.16 S: Definitions and Symbols 


If $T$ is a set, and $A$ a subset of $T$, we shall denote by $\varphi_A$
the characteristic function of $A$, if this causes no confusion.


(Bourbaki, ch. IX) 


$C^\infty(A)$ : if $A \subset R^d$ is an open set: the set of the functions on $A$ continuous, together
with their partial derivatives of all orders; shortened often as $C^\infty$ when $A$ is
understood.
$C_0^\infty(A)$ : if $A \subset R^d$ is an open set: subset of $C^\infty(A)$ consisting of the functions vanishing
outside a closed bounded set contained in $A$.
$C^{(k)}(A)$ : if $A \subset R^d$ is an open set: it is the set of the functions on $A$ with partial
derivatives of order $\le k$ continuous on $A$, $k$ being a non-negative integer.

$\overline{C}^\infty(\Omega)$ : with $\Omega \subset R^d$ arbitrary set with dense interior $\Omega^0$: set of the functions in
$C^\infty(R^d)$ which vanish outside $\Omega$.

$C_0^\infty(\Omega)$ : with $\Omega \subset R^d$ arbitrary set with dense interior $\Omega^0$: set of the functions in
$C^\infty(\Omega^0)$ vanishing outside some closed bounded set contained in $\Omega^0$.

$\overline{C}^{(k)}(\Omega)$ : with $\Omega \subset R^d$ arbitrary set with dense interior: defined as $\overline{C}^\infty(\Omega)$, considering
only the first $k$ derivatives.

$C^\infty(T^d)$ : functions of class $C^\infty$ on the d-dimensional torus $T^d$ (see Definition 12, p.100,
and Definition 13, p.101, §2.21).

$\overline{C}^\infty([0, L])$ : functions in $C^\infty([0, L])$ vanishing in 0 and $L$ together with all the even-order
derivatives.
$C^d$ : complex d-dimensional space and (or) complex d-dimensional vector space.
$(O; i, j, k)$ : orthogonal reference system, $O$ = origin, $i, j, k$ axes unit vectors.
$R^d$ : real d-dimensional space and (or) real d-dimensional vector space.
$T^d$ : d-dimensional torus with side $2\pi$ (see p.101).
$R, R^1$ : real line.
$C, C^1$ : complex plane.
$R_+$ : interval $[0, +\infty)$.
$S_t$ : solution flow for an autonomous differential equation.
$Z^d$ : lattice of the d-tuples of integers.
$Z$ : integer numbers.
$Z_+$ : non-negative integers.
€,,...  : points or vectors in R7, C4 
y,y,... : points in 74. 


$(X^{(i)})_{i\in I}$ : family of objects $X^{(i)}$ parameterized by $i$ in the index set $I$.


t : real parameter with the interpretation of time. 

$\dot x$ : t-derivative of $x$.

$\ddot x$ : second t-derivative of $x$.

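$O(\varepsilon)$ : quantity of the order of magnitude of $\varepsilon$: it means that there is $C > 0$, $\varepsilon_0 > 0$
such that $|O(\varepsilon)| < C|\varepsilon|$ if $|\varepsilon| < \varepsilon_0$. Used when $\varepsilon$ is an “infinitesimal” variable.

$o(\varepsilon)$ : quantity infinitesimal of higher order compared to $\varepsilon$: it means
$\lim_{\varepsilon\to 0} |\varepsilon|^{-1}o(\varepsilon) = 0$.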

mbe : end-of-proof symbol. 


$x\cdot y$ : scalar product of vectors in $R^d$.
$x\wedge y$ : vector product of two vectors in $R^3$.
$P - Q$ : vector whose components in a given frame of reference are the
differences of the homonymous coordinates of $P$ and $Q$ in the
same frame of reference.
: identity or, often, implicit definition. 


= : implicit definition of l.h.s.by the r.h.s. or viceversa. 
Re, Im : real or imaginary part of a complex number.

EN : symbols for the set theoretic difference. 

$\partial$ : partial derivative or boundary of a set

o : gradient operator. 
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eccentric, 304 
perihelion, 304 
areal velocity, 294 
Arnold, 493 
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on integrability, 363 
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a priori estimate, see estimate 
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invariance, 240 
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principle, see principle 
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for Kepler problem, 304 


variables, 290 aati see 
algorithm modulus, 
finite differences, 544 strength, 378 


attractive manifold, 412 
attractor, 376 


alive force, 144 
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fast, 515 projection, 376 
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stochastic, 121 
rotation, 517


balance 

kinetic-potential energy, 135 
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transformation, 470 
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boundary condition 
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seebaricenter, 148 
centrifugal barrier, 300 
chaos, 445, 446, 452 
Chebyshev inequality, 119
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theory, 80 
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Euler-Mascheroni, 125 
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ideal approximate, 181 
ideality condition, 210 
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perfection condition, 210 
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Ascoli-Arzeld, 534 
Birkhoff series, 473, 479 
in distribution, 119 
in probability, 119 
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energy-time, 297 
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Dante, 116, 153 
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space, 33, 216, 285 
Deprit variables, 318 
Descartes, 12 
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differential equation 
autonomous, 33 
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finite difference method, 544 
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reversible, 33 
singular, 31 
solution, 14 
uniqueness, 13 
Dirichlet problem, see problem 
distribution 
of a string, 349 
probability, 115 
random variable, 117 
divergence 
of a field, 137 
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multiplicity, 526 
properties, 525 
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energy 
kinetic, 12, 143 
potential, 12, 36, 142 
energy conservation theorem, 11, 144, 
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entropy 
Boltzmann, 355 
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positivity, 360 
equation 
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Euler, 312 
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331, 333, 362 
Hamiltonian, 136, 214 
Lagrangian, 130, 179, 212 
Liouville, 242 
secular, 17, 523 
symbolic of dynamics, 161 
wave, 265 
equilibrium 
stable, 41, 42 
strong, 44 
tolerance, 41 
equinox 
mean, 517 
equivalence 
Lagrangian Hamiltonian, 215 
ergodic, 349 
ergodic, non mixing, 350 
ergodicity 
quasi periodic, 347 
estimate 
a priori, 28 
Euler, 126 
Euler angles, 200 
Euler formula, 55 
Euler-Lagrange equation, see equation 
Euler-Mascheroni constant, see constant 
expansion, Taylor, 520 


feedback, 80 
Feigenbaum constant, 452 
Fermi coordinates, 183 
finite differences, 262, 544 
Runge-Kutta method, 545 
first integral, see constant of motion 
flow, see differential equation 
geodesic, 326 
Hamiltonian, 218 
irrational, 250 
pulsation, 248 
quasi periodic, 248, 288 
solution, 285 
foliation 
into tori, 288 
force, 4 
active, 160 
conservative, 36, 142 
formula 
De Moivre, 55 
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Euler, 55 
Stirling, 125 
Fourier 
series, multidimensional, 103 
quasi periodic series, 105 
series, 59 
series in C” ({0, L]), 536 
theorem in C ({0, L]), 267 
frequency 
of strings, 348 
ergodicity, 347 
not well defined, 360 
of visit, 342 
quasi periodic, 288 
well defined, 349 
friction, 43, 74 
anchor escapement, 88 
and Lagrangians, 138 
gyroscope, 365 
time scale, 53 
function 
C™ on regular surface, 258 
C® bounded support, 521 
CE (2), 258 
analytic, 337, 481 
generating, 222, 238 
holomorphic, 481 
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implicit, 528 
Lagrangian, 127 
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Lyapunov, 387 
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on the ellipsoid, 327 
triangle, 231
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Lobatchesky, 230 
noneuclidean, 230 
global solution, 28 
golden number, 98 
golden section, see golden number 
gyroscope, 309, 365 


integrability, 310 
Kowaleskaia, 332 


Hamiltonian 
regular, 214 
harmonic mode, see harmonic 
component 
harmonic component, 59 
harmonic oscillator, see oscillator 
Huygens, 12 


ideal constraint condition, 210 
identity 
Jacobi, 242 
independence 
rational, 290, 342 
independent events, 118 
inequality 
Cauchy-Schwartz, 519 
Chebyshev, 119
isoperimetric, 356 
inertia matrix, 308 
inertial frame, 5 
integrability 
analytic, 290, 355 
anisochronous systems, 364 
atom in electric field, 333 
Calogero lattice, 331 
canonical, 290 
canonical, rigid body, 320 
conditions, 289 
criterion, 335, 359 
ellipsoid geodesics, 327, 329 
geodesics on torus, 329 
heavy gyroscope, 330, 331 
ionized hydrogen, 333 
isochronous, 290 
Kowaleskaia gyroscope, 332 
rigid body, 311 
Toda lattice, 331 
integrable system, see motion 
involution, 363 
anisochrony, 363 
irrational number, quadratic, 99 
isochrony, 48, 288, 491, 492 


Jacobi identity, see identity 


Kepler laws, 299 


Kepler problem, action-angles, 304 
kinetic matrix, see matrix 
kinetic-potential energy balance, 135 
Kolmogorov 

iteration, 496 


Lagrangian 
density, 151 
function, 151 
regular, 212 
rigid body, 309 
Laplace 
limit, 304, 486 
operator, 262 
law 
force, 142 
Kepler, 299 
large numbers, 119 
of mechanics, 5 
Legendre duality, 136, 216 
Legendre transformation, 216
Levi-Civita, 486 
Liouville 
operator, 242 
Liouville theorem, see theorem 
local solution, 27 
Lorenz model, 444 


Mach, 9 

manifold 
attractive, 412, 428 
central, 430 
invariant, 412 
stable, 430 
unstable, 430 

map, 219 
canonical homogeneous, 225 
canonical permutation, 239 
complete canonicity condition, 234 
completely canonical, 220 
completely canonical example, 241 
contact, 234 
Deprit canonical, 315, 320 
Henon, 457 
integration, 288 
linear canonical, 234 
Poincaré, 440 
relatively canonical, 219 
symplectic, 234 
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matrix 

inertia, 308 

kinetic, 177 

Lyapunov, 441 

positive definite, 525 

stability, 382 

wronskian, 17, 69 
Maupertuis principle, see principle
maximal solution, 27 
method 

Runge-Kutta, 545 
mixing, 349 
mode 

excited, 249 

normal, 246, 284 

spatial structure, 263 
model, 2 

anchor escapement, 86 

elastic film, 257 

elastic string, 256 

five modes NS, 374 

Lorenz, 374 

seven modes NS, 374 
momentum 

angular, 148 

generalized, 217 

linear, 148 
motion 

asymptotically periodic, 57 

central, 292 

conservative, 36 

constant of, 287 

constraint, 157 

constraint compatible, 159 

deferent, 476 

epicycle, 476

Gauss’ method, 548 

history, 334 

integrable, 288 

periodic, 35 

precession, 476 

quasi periodic, 248, 288, 311 

small oscillations, 65 

varied, 127 
multi periodic, see function 


Navier-Stokes 
5 modes truncation, 446 
7 modes truncation, 452 
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Newton, 9 
node line, 200, 296 
non integrability 
criterion, 359 
geodesics, with negative curvature, 
360 
non isochrony, see anisochrony 
nutation
constant, 517 
nutation, 325, 514 
Moon, 516 
solar, 515, 517 


observable, 109 
history, 109 
oscillation 
fatigue, 170 
isochrony, 288 
pulsation, 284 
small, 65, 284, 288 
oscillator 
harmonic, 48 
boundary condition, 256 
Duffing, 513 
elastic body, 253 
elastic film, 254 
elastic string, 253 
harmonic, 288 
linear coupled, 246 
proper time scale, 53 
resonant, 75 
resonating, 492 


paradox 
Zermelo, 219 
partition 
analytically regular, 341 
path 
mechanical, 230 
optical, 231 
pendulum, 65 
damped, 70 
Escande-Doveil, 513 
periodically forced, 74 
periodic motion, superposition, 93 
perturbation 
algorithms, 473 
regularized, 496 
phase 


space, 216, 285, 290 
phase space, 136 
partition, 341 
Phoedrus, 440 
planetary orbit determination, 548 
Poincaré, 493 
point mass mechanics, 3 
Poisson 
bracket, 237, 362 
precession, 514 
equinoxes, 517 
Hamiltonian, 321 
lunisolar, 324 
solar, 321 
prime integral, see constant of motion 
principle 
of mechanics third, 147 
action, 130 
conservation of difficulty, 155 
D’Alembert, 161 
Fermat, 230 
Hamilton, 136, 222 
homogeneity space-time, 8 
least action, 132, 326 
least action with constraints, 163 
Maupertuis, 229, 326, 336 
of inertia, 6 
of mechanics, first, 5 
of mechanics, second, 5 
of mechanics, third, 6, 146 
virtual work, 161 
probability distribution, see distribution 
problem 
Dirichlet, 263, 277 
Kepler, 299 
two bodies, 292 
proof 
constructive, 19 
Ptolemy, 476 
pulsation, 100, 248, see oscillation, 288 


quadrature, 12, 22, 36, 320, 329, 515 
quasi periodic function, see function 


random variable, 117 

rational approximation, best, 97 
rational independence, 352

rational independence, 105, 250, 335 
reference system, 3 


relation 

canonical commutation, 237 
renormalization group, 495 
resonance, 76, 113 
reversible equation, see differential 

equation 

Riemann measurability, 343 
rigid body integrability, 311 
rotation 

axis, 517 

daily, 517 

mean axis, 517 


satellite 

artificial, 303 
secular equation, see equation 
sequence 

mixing, 349 
set 

analytically regular, 338 

attractor, 376 

bi-invariant, 375 

invariant, 375 

invariant stable, 375 

locally analytic, 338 
solution flow, see differential equation, 

285 

space 

C0 (2), 255 

data, 216, 285 

phases, 136, 216, 285 
stability 

anchor escapement, 88 

clock, 87 

matrix, 441 

of a map, 441 
stable equilibrium, see equilibrium 
stationarity point, 128 
Stirling formula, see formula 
string 

distribution of, 349 

ergodic, 349 

frequency, 348 

homologous to a given string, 349 

of symbols, 349 
surface 

codimension, 171 

locally analytic, 338 

regular, 171 
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system 
anisochronous, 363 


theological, animistic and mystical 
conceptions in mechanics, 244 
theorem 
Euler, 92 
alive force, 144 
analytic implicit functions, 483, 540 
Arnold, 488 
Arnold, on constraints, 186 
Ascoli-Arzelà, 534
baricenter, 148 
central manifold, 430 
Deprit, 319 
energy conservation, 11, 162 
Fourier series, 60 
global implicit functions, 533 
Hopf bifurcation, 431 
Hopf-Anosov-Sinai, 360 
implicit functions, 528 
König, 205 
KAM, 461 
Koushnirenko, 355 
Lagrange on strings, 265 
Liouville, 137, 218, 242 
Liouville on integrability, 362 
Lyapunov, 382 
Lyapunov 2d, 387 
recursion, 219 
Shannon-McMillan, 360 
small denominators, 488 
Vitali convergence, 510 
tidal stress, 300 
time absolute, 3 
time evolution flow, 285 
tolerance, see equilibrium 
torus, 101 
rotation, 248 
standard, 101 
transformation, see map 
Birkhoff, 470 
trigonometry, spherical, 319 
Truesdell, 9 
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action, 464 

variables 
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variation of motion, see motion velocity, 278, 282 

variational minimum, 129 Webster, 78 

vibration work 
fatigue, 170 conservative force, 145 
normal mode, 246 of a force, 113, 144 

virtual, 162 

oe tis 265 wronskian matrix, 17 
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